[rfc] bencode unicode strings

Alexander Belchenko bialix at ukr.net
Tue Jun 16 16:01:19 BST 2009


Thanks for the hints.

Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Alexander Belchenko wrote:
>> See my explanations above: I have unicode strings everywhere in Qt and I
>> need to store some of these strings in config files (e.g. qbzr.conf,
>> branch.conf or tree.conf). Configobj requires me to provide an unicode
>> strings to store
> 
> No, Configobj can work in bytestring or unicode modes.

AFAICT that's not quite true when it comes to writing non-ASCII data to
a file. According to the ConfigObj documentation:

"When using the write method, ConfigObj uses the encoding attribute to 
encode the Unicode strings. If any members (or keys) have been set as 
byte strings instead of Unicode, these must first be decoded to Unicode 
before outputting in the specified encoding.

default_encoding, if specified, is the encoding used to decode byte 
strings in the ConfigObj before writing. If this is None, then the 
Python default encoding (sys.defaultencoding - usually ASCII) is used."

So I doubt one can write an arbitrary byte stream to a config file
(because ConfigObj will try to decode those bytes to unicode under the
hood).
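
Roughly how I read that paragraph, as a sketch only: write_value below is
a made-up helper that mimics the described decode-then-encode step, it is
not ConfigObj's actual code, and the default_encoding="ascii" default is
my assumption based on the quoted text (Python 3 syntax).

```python
# Sketch only: mimics the decode-then-encode step described in the quoted
# documentation. write_value is a hypothetical helper, not ConfigObj code.
def write_value(value, encoding="utf-8", default_encoding="ascii"):
    """Return the bytes that would be written for one config value."""
    if isinstance(value, bytes):
        # Byte strings are first decoded to unicode with default_encoding.
        value = value.decode(default_encoding)
    # The unicode value is then encoded with the output encoding.
    return value.encode(encoding)

text = "caf\u00e9"            # a unicode string, e.g. coming from Qt
raw = text.encode("utf-8")    # the same text as a UTF-8 byte string

written = write_value(text)   # unicode input goes straight to the encode step

try:
    write_value(raw)          # non-ASCII bytes hit the ASCII decode step first
    failure = None
except UnicodeDecodeError as exc:
    failure = exc
```

With default_encoding set to "utf-8" the same byte string would pass, but
only because it happens to be valid UTF-8.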

> 
>> , so I have to do double conversion here:
> 
> If you're storing them in config files, you should generally store
> human-readable values.  ConfigObj supports dict of (dicts of) unicode
> strings-- isn't that sufficient?

Do you mean the standard ConfigObj interface as a dict-like object?
Yep, that will work; it just requires using ConfigObj directly and
bypassing the internal config classes from bzrlib.

I just need to figure out how to get the file name of a specific
branch/tree conf file (e.g. branch.conf).

> 
>> 1) Convert my unicode strings to utf-8
>> 2) Bencode them
>> 3) Convert bencoded result from utf-8 to unicode
> 
> I think this may be invalid.
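
For reference, the three steps look roughly like this. It is a minimal
sketch that handles bencode byte strings only ("<length>:<bytes>");
bencode_str and bdecode_str are hypothetical helpers written for this
mail, not the bencode module shipped with bzrlib (Python 3 syntax).

```python
# Minimal sketch: bencode for byte strings only ("<length>:<bytes>").
# bencode_str/bdecode_str are hypothetical helpers, not bzrlib's bencode.
def bencode_str(data: bytes) -> bytes:
    """Prefix the payload with its length in bytes and a colon."""
    return str(len(data)).encode("ascii") + b":" + data

def bdecode_str(encoded: bytes) -> bytes:
    """Split off the length prefix and check it against the payload."""
    length, _, rest = encoded.partition(b":")
    if len(rest) != int(length):
        raise ValueError("length prefix does not match payload")
    return rest

original = "caf\u00e9"               # unicode string from the GUI
step1 = original.encode("utf-8")     # 1) unicode -> UTF-8 bytes
step2 = bencode_str(step1)           # 2) bencode the bytes
step3 = step2.decode("utf-8")        # 3) bencoded bytes -> unicode

# The prefix counts UTF-8 bytes (5 here), not characters (4), so reading
# the value back has to re-encode to UTF-8 before bdecoding.
restored = bdecode_str(step3.encode("utf-8")).decode("utf-8")
```

In this sketch step 3 only works because the payload is itself valid
UTF-8; an arbitrary byte string would fail to decode at that point.
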
> 
> Aaron.
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
> 
> iEYEARECAAYFAko3p44ACgkQ0F+nu1YWqI0TTwCfQyXlT+lg/cpa+NEPWcVatl+z
> ikoAn2KcADegtov1aaIYrB3Z9ZnTVRJ9
> =a4Ts
> -----END PGP SIGNATURE-----
> 
> 



