Bug: Commit message containing control characters

Jan Hudec bulb at ucw.cz
Sun Sep 4 22:08:40 BST 2005


On Fri, Sep 02, 2005 at 20:33:54 -0400, Derrick Hudson wrote:
> On 9/1/05, Harald Meland <harald.meland at usit.uio.no> wrote:
> > [Robert Collins]
> > 
> > >> The root of the problem is that the XML 1.0 specification doesn't seem
> > >> to allow encoding of such "control characters" as e.g. "\x01", if I
> > >> understand the well-formedness constraint here correctly:
> > >>
> > >>   http://www.w3.org/TR/REC-xml/#NT-Char
> > >
> > > &#x1; should work.
> > 
> > I don't think so; the XML 1.0 specification's section "Character and
> > Entity References" (http://www.w3.org/TR/REC-xml/#sec-references)
> > says:
> 
> This is my reading of the spec too -- ASCII control characters (except
> tab, line feed and carriage return) cannot be represented natively in
> XML.  One could encode the data as base64 or whatever and put that in
> the XML.  I ran into this problem before when trying to update from an
> obsolete XML parser that wasn't standards compliant to one that
> enforces the spec.

What's wrong with &#1;? The point is that an XML writer should always
entity-escape everything, and the parser should always entity-unescape it.
Unfortunately, many don't.
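
Whether a given parser accepts &#1; can be checked directly. Below is a
minimal sketch, assuming Python's standard xml.etree.ElementTree (which is
backed by expat); other parsers may behave differently. It also sketches the
base64 fallback suggested above. The helper names (parser_accepts,
wrap_message, unwrap_message) and the "encoding" attribute are hypothetical,
made up for illustration, and are not anything bzr actually uses.

```python
import base64
import xml.etree.ElementTree as ET


def parser_accepts(xml_text):
    """Return True if ElementTree (expat) parses xml_text without error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def wrap_message(message):
    """Hypothetical helper: store a commit message as base64 text in XML."""
    # The "encoding" attribute name is an assumption, chosen for illustration.
    elem = ET.Element("message", encoding="base64")
    elem.text = base64.b64encode(message.encode("utf-8")).decode("ascii")
    return ET.tostring(elem, encoding="unicode")


def unwrap_message(xml_text):
    """Inverse of wrap_message: decode the base64 payload back to a str."""
    elem = ET.fromstring(xml_text)
    return base64.b64decode(elem.text).decode("utf-8")
```

Calling parser_accepts("<m>&#1;</m>") and parser_accepts("<m>&#9;</m>")
shows how this parser treats a control-character reference compared with a
tab, and unwrap_message(wrap_message("fix\x01bug")) round-trips a message
that contains a control character.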

--
						 Jan 'Bulb' Hudec <bulb at ucw.cz>