Bug: Commit message containing control characters
Robert Collins
robertc at robertcollins.net
Fri Sep 2 11:30:59 BST 2005
On Fri, 2005-09-02 at 01:56 +0200, Harald Meland wrote:
> \x01bar</message>\n</revision>\n'
Maybe my XML foo isn't where it needs to be, but it looks to me like you
have a literal 0x01 in the string, _not_ &#1;. (\x01 is the Python
escaping for control characters in strings).
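A minimal sketch of the distinction, assuming the character reference is spelled &#1; and using a made-up message text ("foo...bar" is illustrative, not the original commit message):

```python
# Python escape: the string holds the single control character U+0001.
literal = "foo\x01bar"

# XML character reference: the string holds the four characters & # 1 ;
# and it is up to an XML parser to interpret them.
reference = "foo&#1;bar"

print(len(literal), repr(literal[3]))
print(len(reference), repr(reference[3:7]))
```

The first string is what a serialiser produces if it writes the message text through unescaped; the second is what it would produce if it emitted a numeric character reference instead.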
So while I agree that the XML 1.0 spec doesn't allow for &#1;, I'm not
convinced that that is the problem here... I'll bet that elementtree
isn't serialising this properly for us.
And in fact, it isn't:
  File "./foo.py", line 5, in ?
    print bzrlib.xml.unpack_xml(Revision, StringIO(xml))
  File "/home/robertc/source/baz/bzr-test-fixes/bzrlib/xml.py", line 40, in unpack_xml
    return cls.from_element(ElementTree().parse(f))
  File "/home/robertc/source/baz/bzr-test-fixes/bzrlib/util/elementtree/ElementTree.py", line 583, in parse
    parser.feed(data)
  File "/home/robertc/source/baz/bzr-test-fixes/bzrlib/util/elementtree/ElementTree.py", line 1242, in feed
    self._parser.Parse(data, 0)
xml.parsers.expat.ExpatError: reference to invalid character number: line 2, column 12
which is a different exception than you reported - and what I would have
expected.
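The same behaviour can be reproduced against expat directly, without bzrlib. This is a sketch in present-day Python 3, not the foo.py script used above; the helper name try_parse and the sample documents are hypothetical.

```python
from xml.parsers import expat


def try_parse(data):
    """Parse data with expat; return the error text, or None on success."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except expat.ExpatError as err:
        # ErrorString maps the numeric error code to expat's message.
        return expat.ErrorString(err.code)
    return None


# A character reference to U+0001: well-formed syntax, illegal character.
reference_error = try_parse(b"<message>foo&#1;bar</message>")

# A literal 0x01 byte embedded in the text.
literal_error = try_parse(b"<message>foo\x01bar</message>")

print("reference:", reference_error)
print("literal:  ", literal_error)
```

Both documents are rejected. The character reference yields the "reference to invalid character number" message seen in the traceback, while the literal byte is reported through a separate error path.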
So there are two issues here:
1) our XML serialisation does not seem as robust as we might like (the
literal control character was embedded in the message, not the entity).
2) as you have noted, even correctly represented, you cannot use &#1; in
XML 1.0 (outside of a CDATA section).
I think we should solve 1, but I don't have any good ideas for 2 -
sorry.
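One possible direction for issue 1, offered only as a sketch and not as something proposed in this thread: check the message against the XML 1.0 Char production before serialising, so that the problem is detected at commit time rather than when the revision is read back. The function name find_illegal_xml_chars is hypothetical and is not part of bzrlib.

```python
import re

# Characters allowed by the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# The pattern matches anything outside that set.
_ILLEGAL_XML_CHAR = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_illegal_xml_chars(text):
    """Return the offsets of characters that XML 1.0 cannot represent."""
    return [match.start() for match in _ILLEGAL_XML_CHAR.finditer(text)]


print(find_illegal_xml_chars("foo\x01bar"))
print(find_illegal_xml_chars("tabs\tand\nnewlines are fine"))
```

What to do with a message that fails the check (reject it, strip the characters, or encode them some other way) is exactly the open question in issue 2, and this sketch does not answer it.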
Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.