[BUG] bzr changeset generation fails with non-ascii characters

Aaron Bentley aaron.bentley at utoronto.ca
Fri Jul 15 21:50:57 BST 2005


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi all,

My python installation thinks 'ascii' is a good character encoding, and
who am I to argue?  This means that William Dodé is my constant nemesis,
because wherever his distinctly non-ascii name appears, trouble is sure
to follow.

In this case, I get an error with bzr changeset (full traceback below).
Essentially, it says that bzrlib.diff.internal_diff can't convert 0xc3
(the lead byte of a UTF-8 encoded acute e) to ASCII.  That sounds fair
enough, but what may not be obvious here is that it shouldn't need to.
internal_diff should be operating in a binary/8-bit fashion on all
sequence data; otherwise, you can get lossy character conversions, or
errors because certain Unicode codepoints are undefined.  Bzr isn't
interested in these files as text; it's their byte streams that matter.
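
A minimal sketch of the failure mode, written for a current Python 3
rather than the python2.3 in the traceback.  It assumes the offending
data is the UTF-8 encoding of the name above, and the helper
implicit_ascii_write is hypothetical: it only imitates what Python 2 does
when a plain byte string is handed to an ASCII codec writer (an implicit
decode to unicode, then an encode).

```python
# Assumption: the bytes in question are UTF-8, so the acute e is 0xc3 0xa9.
name = "William Dod\u00e9".encode("utf-8")


def implicit_ascii_write(data: bytes) -> bytes:
    """Hypothetical stand-in for an ASCII codec writer under Python 2.

    Python 2 silently decodes a byte string with the default 'ascii'
    codec before encoding it, which is the step that blows up.
    """
    return data.decode("ascii").encode("ascii")


error = None
try:
    implicit_ascii_write(name)
except UnicodeDecodeError as exc:
    # The decode fails on the first non-ASCII byte.
    error = exc

print(error)
```

The exception is a decode error even though the caller asked for a write,
which matches the last two frames of the traceback.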

So we need to figure out what is provoking unicode handling of this
data, and get it to use an 8-bit, encoding-ignorant approach instead.
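
As a rough illustration of what an encoding-ignorant approach could look
like, here is a sketch in current Python 3.  It is not bzrlib's
internal_diff: write_bytes_diff is a hypothetical helper, and
difflib.diff_bytes only exists in Python 3.5 and later, so this shows the
idea rather than a fix for the code in the traceback.

```python
import difflib
import io


def write_bytes_diff(old_lines, new_lines, to_file):
    """Hypothetical helper: diff lists of byte lines, write raw bytes.

    Nothing here decodes the content with a real character encoding, so
    non-ASCII bytes pass through unchanged.
    """
    diff = difflib.diff_bytes(
        difflib.unified_diff,
        old_lines,
        new_lines,
        fromfile=b"old",
        tofile=b"new",
    )
    for line in diff:
        to_file.write(line)


old = [b"author: nobody\n"]
new = ["author: William Dod\u00e9\n".encode("utf-8")]

# A binary stream: it accepts bytes and never applies a codec.
out = io.BytesIO()
write_bytes_diff(old, new, out)
```

The important part is the destination: as long as to_file is a plain
binary stream rather than a codec-wrapped one, the bytes 0xc3 0xa9 come
out exactly as they went in.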

Aaron

Here is the original traceback:
[30885]
[30885] Traceback (most recent call last):
[30885]   File "/home/abentley/bzr.bugfix/bzrlib/commands.py", line 1813, in main
[30885]     return run_bzr(argv)
[30885]   File "/home/abentley/bzr.bugfix/bzrlib/commands.py", line 1789, in run_bzr
[30885]     return cmd_class(cmdopts, cmdargs).status
[30885]   File "/home/abentley/bzr.bugfix/bzrlib/commands.py", line 202, in __init__
[30885]     self.status = self.run(**cmdargs)
[30885]   File "/home/abentley/.bzr.conf/plugins/bzr-changeset/__init__.py", line 113, in run
[30885]     to_file=outf, include_full_diff=verbose)
[30885]   File "/home/abentley/.bzr.conf/plugins/bzr-changeset/gen_changeset.py", line 447, in show_changeset
[30885]     meta.write_meta_info(to_file)
[30885]   File "/home/abentley/.bzr.conf/plugins/bzr-changeset/gen_changeset.py", line 266, in write_meta_info
[30885]     self._write_diffs()
[30885]   File "/home/abentley/.bzr.conf/plugins/bzr-changeset/gen_changeset.py", line 394, in _write_diffs
[30885]     self.to_file)
[30885]   File "/home/abentley/bzr.bugfix/bzrlib/diff.py", line 67, in internal_diff
[30885]     to_file.write(line)
[30885]   File "/home/abentley//lib/python2.3/codecs.py", line 178, in write
[30885]     data, consumed = self.encode(object, self.errors)
[30885] UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 36: ordinal not in range(128)
[30885]
[30885]
[30885] finished, 2.290u/0.210s cpu, 0.000u/0.000s cum, 2.620 elapsed
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFC2CGx0F+nu1YWqI0RAlyEAJ4xdMV2QS4zFTF36wVn5HHt0wWQ+gCfRgD3
ZmZoJSj3Gl9x0uCFmQlDqco=
=VwGO
-----END PGP SIGNATURE-----



