Bug: Commit message containing control characters
Harald Meland
harald.meland at usit.uio.no
Mon Sep 5 08:01:17 BST 2005
[Martin Pool]
> On 05/09/05, Harald Meland <harald.meland at usit.uio.no> wrote:
>> > Am I saying anything different? But the context before was about not being
>> > able to represent \x01 in XML -- which is possible using a numeric character reference.
I think it was Jan Hudec who said the above; it certainly wasn't
me. :-)
> I guess we have the option of storing literally "\x01", ie the four
> characters BACKSLASH X ZERO ONE, or something along those lines.
That's what the patch I posted a few days ago does
(http://patchwork.ozlabs.org/bazaar-ng/patch?id=2236). The patch is a
little ugly as it stands, because I wasn't sure whether this was the
right way to attack the problem -- but I'd be happy to clean it up.
> It is a bit like inventing our own syntax.
Yeah, and it is a non-reversible transformation of the message text.
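To illustrate why the transformation is not reversible, here is a
minimal sketch. It is not the code from the patch; the function name
escape_control_chars and the exact set of characters it escapes are
assumptions made for this example. A message containing a real \x01
and a message in which somebody typed the four characters BACKSLASH X
ZERO ONE end up as the same stored text:

```python
import re

# Control characters other than tab, newline and carriage return.
# (Assumed set for this sketch; the real patch may cover a different range.)
_CONTROL_CHARS = re.compile(r'[\x00-\x08\x0b\x0c\x0e-\x1f]')


def escape_control_chars(text):
    """Replace each control character with its four-character \\xNN spelling."""
    return _CONTROL_CHARS.sub(lambda m: '\\x%02x' % ord(m.group()), text)


# A message with a real control character in it ...
escaped_control = escape_control_chars('fix\x01bug')
# ... and a message where the user literally typed backslash, x, 0, 1.
escaped_literal = escape_control_chars('fix\\x01bug')

# Both are stored identically, so the original cannot be recovered.
print(escaped_control == escaped_literal)
```

Making it reversible would require escaping the backslash itself as
well, which moves even further towards inventing our own syntax.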
> I really think of the commit message as text, not binary data, and
> so not something that should be containing non-whitespace control
> characters.
I agree.
> Perhaps we should just do this when taking the commit message in,
> and not worry about unescaping, or even squash them to '?' (as
> Python can do with unrepresentable characters). That would at least
> stop the exception.
My patch does this -- but for anything that uses
bzrlib.xml.pack_xml(), not just commit messages.
Maybe it would be better to do the message escaping in
bzrlib.revision.Revision.to_element(), and patch pack_xml() so that it
raises an error if it finds any non-escaped characters?
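Roughly, that split could look like the sketch below. The helper names
squash_control_chars and check_xml_text are hypothetical and are not
existing bzrlib functions; the first stands in for whatever cleanup
would happen on the revision side, the second for the strict check
that pack_xml() might apply before serialising:

```python
import re

# Characters that XML 1.0 text cannot contain (control characters other
# than tab, newline and carriage return, plus U+FFFE and U+FFFF).
_XML_ILLEGAL = re.compile(r'[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]')


def squash_control_chars(message):
    """Hypothetical cleanup step: replace unrepresentable characters with '?'."""
    return _XML_ILLEGAL.sub('?', message)


def check_xml_text(text):
    """Hypothetical strict check: refuse text that XML cannot represent."""
    match = _XML_ILLEGAL.search(text)
    if match is not None:
        raise ValueError('character %r at offset %d cannot be stored in XML'
                         % (match.group(), match.start()))
    return text


# The commit message is cleaned once, on the way in ...
clean = squash_control_chars('fix\x01bug')
# ... so the strict check passes for it, but would still catch any
# other caller that hands over unescaped text.
check_xml_text(clean)
```

That way only commit messages get altered, and any other caller that
passes bad text gets a clear error instead of broken XML.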
--
Harald