[MERGE] UTF-8 encoding in binary diffs
Goffredo Baroncelli
kreijack at tiscalinet.it
Tue Jul 10 19:52:26 BST 2007
On Tuesday 10 July 2007, Martin Pool wrote:
> Martin Pool has voted -0.
> Status is now: Waiting
> Comment:
> I think changing to %s rather than %r is definitely right.
>
> And I think encoding just the filename, rather than the whole stream, is
> also probably right, as it will give better results if we don't know the
> encoding of the contents of the file.
>
> My only query is whether we should be hardcoding utf-8 here. Shouldn't
> we be putting the filenames into the user's encoding?
Internally, paths in bazaar are unicode (not __encoded__).
The problem in the function _show_diff_trees() is that it mixes user data
(which is already __encoded__) with the paths, which aren't __encoded__. I
don't know whether this is the only such case.
Because I use this function in my web interface (webserve), I suggest
passing another parameter to the function: the encoding to use for the
unicode data (the file paths).
So, for internal use (such as webserve), this function would encode the data as
utf8; for external use (the bzr diff command, for example), the encoding would
be the "user's encoding".
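
A minimal sketch of this proposal, written for a current Python rather than
against the real bzr code. The names encode_diff_header, build_diff and
path_encoding are hypothetical, and the header text is only illustrative; the
point is that only the path is encoded, while the file content passes through
untouched as bytes:

```python
def encode_diff_header(path, path_encoding="utf-8"):
    """Hypothetical helper: turn a unicode path into an encoded header line."""
    header = "=== modified file '%s'\n" % path
    # Characters the target encoding cannot represent are replaced
    # instead of raising UnicodeEncodeError.
    return header.encode(path_encoding, "replace")


def build_diff(path, content_lines, path_encoding="utf-8"):
    """Hypothetical helper: join the encoded header with raw content bytes."""
    # content_lines are already bytes in an unknown encoding, so they are
    # never decoded or re-encoded here.
    return encode_diff_header(path, path_encoding) + b"".join(content_lines)


# Internal caller (a web interface, say) asks for utf8 ...
web_diff = build_diff("caf\u00e9.txt", [b"+\xff\n"], "utf-8")
# ... while a command-line caller would pass the user's encoding.
cli_diff = build_diff("caf\u00e9.txt", [b"+\xff\n"], "latin-1")
```

With this shape the caller, not the diff function, decides the encoding, so
nothing is hardcoded inside the function beyond a default.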
I wonder how this function can _now_ work in a non-utf8 environment (e.g.
windows). I have two hypotheses:
1) it doesn't work, and displays a utf8-encoded stream in a non-utf8
environment
2) it works, because on stdout the utf8-encoded stream is converted to the
local encoding
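
To illustrate what hypothesis 1 would look like, here is a small sketch that
assumes the terminal simply interprets the bytes as cp1252 (an assumption
chosen for the example, not a statement about what bzr or windows actually
does). It does not settle which hypothesis is correct:

```python
path = "caf\u00e9.txt"

# The path is written out as utf8 bytes ...
utf8_bytes = path.encode("utf-8")

# ... and a cp1252 terminal decodes those same bytes with its own codec,
# so the two-byte utf8 sequence for the accented letter becomes two characters.
shown = utf8_bytes.decode("cp1252")

# Under hypothesis 2 the stream would be transcoded to the local encoding first.
converted = utf8_bytes.decode("utf-8").encode("cp1252")
```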
>
> For details, see:
>
http://bundlebuggy.aaronbentley.com/request/%3Cd06a5cd30707090048l588e49c3wdfb8aaf3379f3b5e%40mail.gmail.com%3E
>
>
--
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) <kreijack at inwind.it>
Key fingerprint = CE3C 7E01 6782 30A3 5B87 87C0 BB86 505C 6B2A CFF9