[MERGE] Make annotate behave in a non-ASCII world
Goffredo Baroncelli
kreijack at tiscalinet.it
Wed Jul 11 19:27:29 BST 2007
On Wednesday 11 July 2007, John Arbash Meinel wrote:
> So the problem is that we need to write some things is sys.stdout.encoding
> (with errors='replace') and some things as 8-bit strings.
>
Yes, this is the same problem that diff has (the paths are unicode, the file
content is raw 8-bit data).
I ran into it while testing my patch "[RFC][PATCH] Show the diff
in the commit messages": the diff function uses a stream to pass the data to
the higher-level function.
I think it is time to develop a common wrapper for this kind of
situation, something like:
import StringIO

class UnicodeStringIO(StringIO.StringIO):
    def __init__(self, buf='', decoding='utf8'):
        StringIO.StringIO.__init__(self, buf)
        # encoding assumed for any 8-bit string passed to write()
        self._usio_decoding = decoding

    def write(self, s):
        # unicode is stored as is; 8-bit strings are decoded first
        if not isinstance(s, unicode):
            s = s.decode(self._usio_decoding, "replace")
        StringIO.StringIO.write(self, s)
This class always returns a unicode string. If the string passed to the
wrapper is internal bazaar [meta]data (such as a user name or a path, which are
unicode strings), it is stored "as is". If it is an 8-bit string, it is
_decoded_ with the selected encoding and then stored.
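
For illustration only, here is a minimal sketch of the same idea written for
Python 3, where the unicode type is str and 8-bit strings are bytes. The name
DecodingStringIO and the sample data are assumptions made up for this example;
they are not part of bzrlib or of the attached file.

```python
import io


class DecodingStringIO(io.StringIO):
    """Text buffer accepting both text and raw bytes (hypothetical name)."""

    def __init__(self, initial_value="", decoding="utf-8"):
        super().__init__(initial_value)
        # Encoding assumed for any bytes passed to write().
        self._decoding = decoding

    def write(self, s):
        # Raw 8-bit data (e.g. file content) is decoded first; bytes that
        # cannot be decoded become U+FFFD instead of raising an error.
        if isinstance(s, bytes):
            s = s.decode(self._decoding, "replace")
        return super().write(s)


out = DecodingStringIO()
out.write("caf\u00e9.txt: ")   # metadata, already text: stored as is
out.write(b"caf\xc3\xa9\n")    # valid UTF-8 bytes: decoded
out.write(b"\xff\n")           # invalid UTF-8: replaced with U+FFFD
result = out.getvalue()        # always a text (unicode) string
```

Whatever mix of text and bytes goes in, getvalue() hands back text only, so
the caller can encode it once for the terminal.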
Any thoughts?
--
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) <kreijack at inwind.it>
Key fingerprint = CE3C 7E01 6782 30A3 5B87 87C0 BB86 505C 6B2A CFF9
-------------- next part --------------
A non-text attachment was scrubbed...
Name: unicodestringio.py
Type: application/x-python
Size: 1634 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20070711/a0e2ad90/attachment.bin