About encoding issues

Jan Hudec bulb at ucw.cz
Mon Apr 24 09:46:58 BST 2006


On Mon, Apr 24, 2006 at 13:06:58 +1000, Martin Pool wrote:
> On 24/04/2006, at 12:59 PM, Martin Pool wrote:
> >>
> >>and see what happens. Unfortunately the errors it gave were pretty  
> >>useless:
> >>
> >>Traceback (most recent call last):
> >>  File "/usr/lib/python2.4/logging/__init__.py", line 739, in emit
> >>  File "/usr/lib/python2.4/encodings/undefined.py", line 22, in  
> >>decode
> >>    raise UnicodeError, "undefined encoding"
> >>UnicodeError: undefined encoding
> >
> >Perhaps we're initializing logging in a way that provokes this?  It  
> >does seem possible to load and use logging with the default  
> >encoding of undefined.
> 
> Of course what's happening is that we currently use ascii literals in  
> Python (regular double-quoted strings) for many of the messages that  
> bzr sends to the user or to the log file.  Both the user's terminal  
> and the log file are potentially unicode in a particular  
> representation, so for a string to get there it needs to first be  
> Unicode and then be encoded in the right way.  However, if the  
> literals are byte strings and implicit conversion is disabled, we  
> can't print them.

Looking at the code involved in the above backtrace, it seems that the
problem is in writing to the output itself.

> We can fix this by either making them u"" unicode literals, or  
> explicitly converting in the places that print them.  It may be  
> worthwhile doing so that we can get errors on implicit conversion of  
> non-ascii strings.

It would be worthwhile to file a wishlist item against Python for a
file-level 'all literals are unicode' declaration (like Perl has).

For the time being we will have to add 'u' to all literals anyway.
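
A minimal sketch of what that looks like, under some assumptions: the
emit() helper and the messages are made up for illustration and are not
bzr code, and the io module and b'' literals need a newer Python than
the 2.4 in the traceback. The message is a u'' literal, it is encoded
explicitly on the way out, and a byte string with non-ascii bytes is
rejected by an ascii decode instead of being silently converted.

```python
# -*- coding: utf-8 -*-
import io


def emit(message, stream, encoding="utf-8"):
    """Hypothetical helper: encode a unicode message explicitly, then write bytes."""
    stream.write(message.encode(encoding))


# Stands in for a terminal or log file opened in binary mode.
out = io.BytesIO()

# Messages are unicode literals, so no implicit conversion is needed.
emit(u"committed revision 42\n", out)
emit(u"na\u00efve file name\n", out)
data = out.getvalue()

# A byte string with non-ascii bytes cannot be decoded as ascii;
# this is the error we would like to see instead of a silent guess.
raw = b"na\xc3\xafve"
try:
    raw.decode("ascii")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
```

The encoding is passed in explicitly here; in practice it would have to
come from whatever the terminal or log file actually expects.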

Note: A recent design decision in Perl 6 is that autoconversion between
byte buffers and unicode strings was a mistake, so it won't exist in
Perl 6 (and literals will be unicode strings).

-- 
						 Jan 'Bulb' Hudec <bulb at ucw.cz>