problems with encodings for signed commits
Aaron Bentley
aaron.bentley at utoronto.ca
Thu Dec 29 05:24:42 GMT 2005
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Dafydd Harries wrote:
| While importing a baz branch into bzr recently, I discovered that the
| testament code fails if a commit message contains non-ascii characters.
| There is code that checks that the method doesn't return a unicode object, but
| it's guarded by an "if __debug__", which I consider to be a bit odd.
|
| The unicode object in question originates in Commit.commit:
No, it doesn't. It originates in baz_import.iter_import_version:
    commitobj.commit(branch, log_message.decode('ascii', 'replace'),
                     verbose=False, committer=log_creator,
                     timestamp=timestamp, timezone=0, rev_id=rev_id)
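For illustration, a minimal sketch of what that decode call does to a
non-ascii log message. It is written for Python 3, where a bytestring is a
bytes object, and the sample message is made up rather than taken from the
import in question.

```python
# Hypothetical log message, stored as UTF-8 bytes.
log_message = "café".encode("utf-8")

# With the 'replace' handler, every byte outside the ascii range becomes
# U+FFFD, so the result is always a unicode string, never a bytestring.
decoded = log_message.decode("ascii", "replace")
print(type(decoded).__name__, repr(decoded))
```

So whatever the original encoding was, the message that reaches
Commit.commit from this call is already unicode.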
|
| if isinstance(message, str):
| message = message.decode(bzrlib.user_encoding)
This is bogus. If Commit.commit gets a bytestring, it should treat it
as ascii, because there's no defined encoding. Assuming that this bytestring
is in the user encoding is not right. The decoding should be done in
cmd_commit, where we know that the bytestring came from the user, and
therefore the user's encoding applies.
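A rough sketch of that split, in Python 3 terms. The two functions below
are stand-ins named after Commit.commit and cmd_commit, not the actual
bzrlib code, and the strict ascii decode (failing rather than replacing)
is an assumption about what "treat it as ascii" should mean.

```python
def commit(message):
    """Library-level entry point: no encoding is defined here."""
    if isinstance(message, bytes):
        # Strict ascii: a non-ascii bytestring raises UnicodeDecodeError
        # instead of being silently guessed at.
        message = message.decode("ascii")
    return message


def cmd_commit(raw_message, user_encoding):
    """Command-line layer: the bytes came from the user, so the user's
    encoding applies and the decode happens here."""
    return commit(raw_message.decode(user_encoding))
```

With this layout, cmd_commit("café".encode("latin-1"), "latin-1") hands
unicode to commit, while passing the same bytes straight to commit fails
loudly.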
| http://muse.19inch.net/~daf/bzr/bzr/devel/
I don't think this is right. The testament should be built assuming its
contents are unicode, or else all fields should be automatically
converted to utf-8. No conditionals.
Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFDs3Ma0F+nu1YWqI0RAiRaAKCJC81rkQyTcEzzbLiUJWLpv+jEswCfc9d9
OI4gAYlFusAb3t8ygo5YdWU=
=zhZS
-----END PGP SIGNATURE-----