First bazaar-ng experience

John A Meinel john at arbash-meinel.com
Thu Jun 30 02:40:21 BST 2005


Martin Pool wrote:

>On 29 Jun 2005, John A Meinel <john at arbash-meinel.com> wrote:
>
>
>
>>Just add these lines right after it:
>>if user_encoding is None:
>>   user_encoding='UTF-8'
>>
>>
>
>Good catch.
>
>
>
>>I think that will at least stop it from failing on you.
>>Now the trick is to figure out what the default encoding should be if
>>getpreferredencoding() should fail.
>>On my linux box, it seems to be "UTF-8", under cygwin it is "US_ASCII",
>>on windows it seems to be 'cp1252'.
>>On *my* mac it seems to be 'mac-roman'.
>>
>>Now, we could default it based on sys.platform, but really the point is
>>that the system should be setting a preferred encoding.
>>I'm not sure why your mac is not.
>>
>>I would actually argue that the best fallback encoding is utf-8, since
>>it means you should be able to encode any international characters.
>>
>>
>
>It'll produce invalid output on 3 of the 4 example platforms you list :(
>
>I think ascii is probably the only safe cross-platform default.  It
>seems like a Python bug if locale.getpreferredencoding returns None.
>
>
>
Well, on those 3 platforms locale.getpreferredencoding doesn't return
None. :) Yes, you can return ASCII if you prefer.
If you read the documentation, it states that the encoding can be None
if Python cannot figure out what it should be. The problem is that it
doesn't say so where you would expect it:

*getlocale*([category])

    Returns the current setting for the given locale category as
    sequence containing language code, encoding. category may be one of
    the LC_* values except LC_ALL. It defaults to LC_CTYPE.

    Except for the code |'C'|, the language code corresponds to RFC 1766
    <http://www.faqs.org/rfcs/rfc1766.html>. language code and encoding
    may be |None| if their values cannot be determined. New in version 2.0.

*getpreferredencoding*([do_setlocale])

    Return the encoding used for text data, according to user
    preferences. User preferences are expressed differently on different
    systems, and might not be available programmatically on some
    systems, so this function only returns a guess.

    On some systems, it is necessary to invoke setlocale to obtain the
    user preferences, so this function is not thread-safe. If invoking
    setlocale is not necessary or desired, do_setlocale should be set to
    |False|.

    New in version 2.3.

Notice that the entry for getlocale() states that the encoding may be
None. The entry for getpreferredencoding() doesn't state that explicitly.
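
Since the documentation doesn't rule out a None result, the lookup can be
wrapped defensively. This is only a rough sketch, not what bzr does today:
get_user_encoding is a made-up name, and the 'ascii' default just follows
Martin's suggestion (pass 'utf-8' if you prefer that fallback).

```python
import locale


def get_user_encoding(fallback='ascii'):
    """Return the user's preferred encoding, or `fallback` if unknown.

    Hypothetical helper for illustration; the name and the default
    fallback are assumptions, not existing bzrlib API.
    """
    try:
        # do_setlocale=False: only query, don't change the process locale.
        encoding = locale.getpreferredencoding(False)
    except locale.Error:
        encoding = None
    # Treat both None and an empty string as "could not be determined".
    if not encoding:
        return fallback
    return encoding
```

Callers would then use get_user_encoding() wherever user_encoding is
needed, and never have to check for None themselves.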

Also, reading the documentation, it seems that we should be calling
"setlocale(LC_ALL, '')" somewhere, either in the bzr script or maybe in
commands.main(). (We shouldn't do it anywhere else, since bzrlib is going
to be a library, not just a program.)
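
Something along these lines is what I have in mind. It is a sketch under
assumptions: init_locale and this main() are placeholders for
illustration, not the real commands.main().

```python
import locale


def init_locale():
    """Adopt the user's locale settings; return True on success.

    Hypothetical helper: meant to be called once from the program's
    entry point, never from library code.
    """
    try:
        # An empty string means "use the settings from the environment".
        locale.setlocale(locale.LC_ALL, '')
    except locale.Error:
        # The environment names a locale this system doesn't support;
        # keep running in the default "C" locale.
        return False
    return True


def main(argv):
    """Placeholder entry point standing in for the real command runner."""
    init_locale()
    # ... parse argv and dispatch to the requested command here ...
    return 0
```

Keeping the setlocale call in the entry point means other programs that
import bzrlib keep control over their own locale.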

John
=:->

