Bug: Commit message containing control characters

Robert Collins robertc at robertcollins.net
Fri Sep 2 11:30:59 BST 2005


On Fri, 2005-09-02 at 01:56 +0200, Harald Meland wrote:

> \x01bar</message>\n</revision>\n'

Maybe my XML-fu isn't where it needs to be, but it looks to me like you
have a literal 0x01 in the string, _not_ &#x1;. (\x01 is the Python
escape for a control character in a string.)
So while I agree that the XML 1.0 spec doesn't allow &#x1;, I'm not
convinced that that is the problem here... I'll bet that elementtree
isn't serialising this properly for us.

And in fact, it isn't:

  File "./foo.py", line 5, in ?
    print bzrlib.xml.unpack_xml(Revision, StringIO(xml))
  File "/home/robertc/source/baz/bzr-test-fixes/bzrlib/xml.py", line 40, in unpack_xml
    return cls.from_element(ElementTree().parse(f))
  File "/home/robertc/source/baz/bzr-test-fixes/bzrlib/util/elementtree/ElementTree.py", line 583, in parse
    parser.feed(data)
  File "/home/robertc/source/baz/bzr-test-fixes/bzrlib/util/elementtree/ElementTree.py", line 1242, in feed
    self._parser.Parse(data, 0)
xml.parsers.expat.ExpatError: reference to invalid character number: line 2, column 12

which is a different exception than you reported - and what I would have
expected.
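
The two failure modes discussed above can be told apart with a small
script. This is only a sketch: it uses the xml.etree.ElementTree module
from a current Python standard library rather than the copy bundled in
bzrlib, and parse_error is a hypothetical helper name, so the exact
exception class differs from the 2005 traceback.

```python
import xml.etree.ElementTree as ET  # stdlib module, assumed stand-in for bzrlib's copy


def parse_error(document):
    """Return the parser's error message for document, or None if it parses."""
    try:
        ET.fromstring(document)
    except ET.ParseError as err:
        return str(err)
    return None


# A raw 0x01 character sitting directly in the element text.
literal = "<message>\x01bar</message>"
# A character reference to U+0001, which XML 1.0 forbids.
reference = "<message>&#1;bar</message>"

print("literal:  ", parse_error(literal))
print("reference:", parse_error(reference))
```

The parser rejects both documents, but with different messages, which is
why the exception text is a useful hint about which form actually ended
up in the file.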

So there are two issues here:
1) Our XML serialisation is not as robust as we might like: the literal
control character was written into the message, not a character
reference.
2) As you have noted, even correctly represented, you cannot use &#1; in
XML 1.0. (A CDATA section does not help: inside one, &#1; is just
literal text, and a raw 0x01 is still not a legal character.)

I think we should solve 1, but I don't have any good ideas for 2 -
sorry.
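
For issue 1, a check along the following lines could at least detect the
problem before it reaches the serialiser. It is a sketch under stated
assumptions, not what bzrlib does: invalid_xml_chars and roundtrips are
hypothetical names, the character class is written from the XML 1.0
"Char" production, and the stdlib xml.etree.ElementTree stands in for
the bundled elementtree.

```python
import re
import xml.etree.ElementTree as ET  # stdlib module, assumed stand-in for bzrlib's copy

# Everything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    r"[\x00-\x08\x0b\x0c\x0e-\x1f\ud800-\udfff\ufffe\uffff]"
)


def invalid_xml_chars(text):
    """Return the characters in text that XML 1.0 cannot represent."""
    return _INVALID_XML_CHARS.findall(text)


def roundtrips(message):
    """Serialise a <message> element and report whether it parses back intact."""
    element = ET.Element("message")
    element.text = message
    data = ET.tostring(element)  # control characters are written out literally
    try:
        return ET.fromstring(data).text == message
    except ET.ParseError:
        return False


for candidate in ("foo bar", "foo\x01bar"):
    print(repr(candidate), invalid_xml_chars(candidate), roundtrips(candidate))
```

This only reports the offending characters. What to do with them
(reject the commit, strip them, or store them in some escaped form) is
issue 2 and is left open here.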

Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.