version-info --include-history UnicodeDecodeError (518609)

Thu Apr 8 22:23:47 BST 2010

On 06/04/2010, Robert Collins <robertc at robertcollins.net> wrote:
>
> RIO is a low level encoding: it should be outputting RIO (which builds
> on utf8), not unicode.
>
> That is to say - if the console were to be in shift-JIS or something,
> outputting unicode would *not* result in correct RIO output.

That's funny, the only documentation I found in the code talks about
"Unicode" but doesn't mention UTF-8 at all. I don't interpret "The
format itself does not deal with character encoding issues, though the
result will normally be written in Unicode." as a ban on printing
readable text on a CP932 console.
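
As a rough illustration of why "Unicode" and "UTF-8" are not
interchangeable here, the sketch below encodes one string both ways and
decodes the bytes the way a console in a given encoding would. The
sample text and the helper name as_seen_on_console are assumptions made
for the example; nothing in it is taken from bzrlib.

```python
# Hypothetical sample text; not taken from the bug report.
text = "日本語"

# The same characters have different byte representations.
utf8_bytes = text.encode("utf-8")
cp932_bytes = text.encode("cp932")


def as_seen_on_console(data, console_encoding):
    """Decode bytes the way a terminal using console_encoding would.

    Undecodable bytes become U+FFFD instead of raising, which roughly
    mimics a terminal showing garbage rather than failing.
    """
    return data.decode(console_encoding, errors="replace")


if __name__ == "__main__":
    # Bytes in the console's own encoding come back readable.
    print(as_seen_on_console(cp932_bytes, "cp932") == text)
    # UTF-8 bytes interpreted as CP932 do not.
    print(as_seen_on_console(utf8_bytes, "cp932") == text)
```

So output that is "defined as UTF-8" is only readable on a CP932
console if something transcodes it before it reaches the terminal.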

> bzr's log is defined as being utf8, so we shouldn't need a replace
> statement there.

Is it? So, when I type `bzr log` on my console and get readable text
rather than UTF-8 mangling, that's a bug?

Personally, I find the current state of affairs unacceptable. Because
many Bazaar developers use UTF-8 consoles and files with English-only
text, it's "easy" to define various operations and 'internal'
bytestrings as being UTF-8 without actually ensuring that's the case,
or that it leads to sensible behaviour for anyone else. But y'all seem
satisfied with a long stream of similar bug reports from Japanese
users about these broken assumptions.

This particular bug is a regression of sorts: the operation used to
give mojibake, and now throws (see the attached
bzr_version_info_failure.log). Note also that log behaves as
(un?)expected.
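
The difference between the two failure modes can be sketched in a few
lines. This is not the bzr code path and does not reproduce the attached
traceback; the function names are hypothetical, and the "ascii" default
is an assumption standing in for whatever strict codec gets applied.

```python
def render_lenient(data, encoding):
    """Decode with replacement: wrong encodings yield mojibake, no error."""
    return data.decode(encoding, errors="replace")


def render_strict(data, encoding="ascii"):
    """Decode strictly: bytes outside the codec raise UnicodeDecodeError."""
    return data.decode(encoding)


# Hypothetical non-ASCII content stored as UTF-8 bytes.
utf8_bytes = "日本語".encode("utf-8")

# Old behaviour, roughly: garbled but non-fatal output.
garbled = render_lenient(utf8_bytes, "cp932")

# New behaviour, roughly: the operation throws.
try:
    render_strict(utf8_bytes)
    raised = False
except UnicodeDecodeError:
    raised = True

if __name__ == "__main__":
    print(repr(garbled))
    print(raised)
```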

Parth: there should be a blackbox test along the lines of that console
session with any fix.

Martin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bzr_version_info_failure.log
Type: application/octet-stream
Size: 3894 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20100408/1efccd94/attachment.obj