[rfc] bencode unicode strings

Alexander Belchenko bialix at ukr.net
Tue Jun 16 09:52:14 BST 2009


Andrew Bennetts wrote:
> Alexander Belchenko wrote:
>> For QBzr I need to implement support for bencoding unicode
>> strings. Standard bencode treats strings in the stream as byte strings.
>>
>> Because Qt internally works with pure unicode (not utf-8, as gtk does), I
>> have to encode strings to utf-8 manually. I want bencode to handle this
>> for me.
>>
>> I'm not sure how this is handled in the revision serializer. Does it make
>> sense to have such support in the core?
> 
> The bencode format only has the concept of byte-strings, not unicode.  So
> currently you need to explicitly encode (and decode).

Did I not write exactly the same thing in my first mail?

>  Because bzr internally
> tends to use utf-8 a fair bit this hasn't been a big burden so far... I'm
> curious to know why explicitly encoding and decoding is so much more of an issue
> for QBzr?

Because *all* strings in Qt dialogs *are* unicode. And when I need to
store them in a bencoded object, I have to encode them to utf-8 first
and decode them from utf-8 afterwards.

>  Perhaps naïvely I would have expected that data like commit messages
> and committer names would already be encoded/decoded for you by bzrlib.  Why is
> QBzr touching revision serialisation directly?

QBzr doesn't even need to know about the revision serializer.
I'm sorry, my English is again below the level at which people can
understand me, and beneath any criticism.

See my explanation above: I have unicode strings everywhere in Qt and I
need to store some of these strings in config files (e.g. qbzr.conf,
branch.conf or tree.conf). ConfigObj requires me to provide unicode
strings to store, so I have to do a double conversion here:

1) Convert my unicode strings to utf-8
2) Bencode them
3) Convert bencoded result from utf-8 to unicode
4) Store them in conf file via ConfigObj
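
As a rough illustration of the four steps above, here is a minimal sketch in modern Python 3 (where `str` is unicode and `bytes` is a byte string). The `bencode_bytes` and `to_config_value` helpers and the `last_message` key are hypothetical names made up for this example; they are not bzrlib's bencode module, and a plain dict stands in for the ConfigObj object.

```python
def bencode_bytes(data: bytes) -> bytes:
    """Bencode one byte string as <length>:<data> (hypothetical helper)."""
    return str(len(data)).encode("ascii") + b":" + data


def to_config_value(text: str) -> str:
    """Turn a unicode string into a unicode value that can go in a config file."""
    utf8 = text.encode("utf-8")      # step 1: unicode -> utf-8 bytes
    encoded = bencode_bytes(utf8)    # step 2: bencode the byte string
    return encoded.decode("utf-8")   # step 3: bencoded bytes -> unicode


# step 4: a plain dict stands in for the ConfigObj section here (assumption)
config = {}
config["last_message"] = to_config_value("Привет")
```

Note that the length prefix counts utf-8 bytes, not characters, so the stored value no longer describes the unicode string it sits in.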

And things are much worse when I need to bencode dicts or lists.
Something is wrong here, n'est-ce pas?
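
To make the dict and list case concrete: every string nested anywhere in the container has to be converted before bencoding. Below is a minimal Python 3 sketch of an encoder that does this conversion itself. The `bencode` function here is a hypothetical helper written for this example, not the existing bzrlib code, and encoding unicode as utf-8 is an assumption of the sketch, since standard bencode has no unicode type.

```python
def bencode(obj) -> bytes:
    """Bencode ints, strings, lists and dicts, encoding unicode as utf-8.

    Hypothetical sketch: standard bencode only knows byte strings.
    """
    if isinstance(obj, str):
        obj = obj.encode("utf-8")          # the conversion done for the caller
    if isinstance(obj, bytes):
        return str(len(obj)).encode("ascii") + b":" + obj
    if isinstance(obj, bool):
        raise TypeError("bool is not a bencode type")
    if isinstance(obj, int):
        return b"i" + str(obj).encode("ascii") + b"e"
    if isinstance(obj, (list, tuple)):
        return b"l" + b"".join(bencode(item) for item in obj) + b"e"
    if isinstance(obj, dict):
        # Keys are byte strings and must appear sorted by their raw bytes.
        items = [
            (k.encode("utf-8") if isinstance(k, str) else k, v)
            for k, v in obj.items()
        ]
        items.sort(key=lambda kv: kv[0])
        body = b"".join(bencode(k) + bencode(v) for k, v in items)
        return b"d" + body + b"e"
    raise TypeError("cannot bencode %r" % type(obj).__name__)


data = bencode({"files": ["a.txt", "б"], "rev": 3})
```

The reverse direction is the harder part: a decoder sees only byte strings and cannot tell which of them were unicode originally, so it would have to decode all of them or none.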

I hope my intent is clearer now.



