[rfc] bencode unicode strings

Andrew Bennetts andrew.bennetts at canonical.com
Tue Jun 16 10:49:14 BST 2009


Alexander Belchenko wrote:
> Andrew Bennetts wrote:
>> Alexander Belchenko wrote:
>>> For QBzr I need to implement support for bencoding unicode strings.
>>> Standard bencode treats strings in the stream as byte strings.
>>>
>>> Because Qt works internally with pure unicode (not utf-8, as gtk does),
>>> I have to encode strings to utf-8 manually. I want bencode to handle
>>> this for me.
>>>
>>> I'm not sure how this is handled in the revision serializer. Does it make
>>> sense to have such support in the core?
>>
>> The bencode format only has the concept of byte-strings, not unicode.  So
>> currently you need to explicitly encode (and decode).
>
> Didn't I write exactly that in my first mail?

Close.  I wanted to emphasise that adding what you want to bencode would
mean an incompatible change to the bencode format.
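
To illustrate, here is a minimal sketch of a bencode encoder. It is not the
bzrlib implementation, and it assumes Python 3, where `bytes` is the
byte-string type and `str` is unicode. The format only offers
`<length>:<bytes>` for strings, so unicode text has to be encoded explicitly
before it goes in:

```python
def bencode(value):
    """Encode ints, byte strings, lists and dicts in bencode form.

    Illustrative only: there is deliberately no branch for unicode text,
    because the format has no type marker for it.
    """
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        # Strings are a byte count, a colon, then the raw bytes.
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and are written in sorted order.
        items = sorted(value.items())
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("bencode has no type for %s" % type(value).__name__)


text = "caf\u00e9"                     # unicode text, 4 characters
encoded = bencode(text.encode("utf-8"))  # explicit encode: 5 bytes in utf-8
```

Passing `text` directly raises `TypeError` in this sketch. Accepting it would
need a new type marker in the stream, which existing decoders would not
understand.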

[...]
> See my explanation above: I have unicode strings everywhere in Qt and I
> need to store some of these strings in config files (e.g. qbzr.conf,
> branch.conf or tree.conf). ConfigObj requires me to provide unicode
> strings to store, so I have to do a double conversion here:
>
> 1) Convert my unicode strings to utf-8
> 2) Bencode them
> 3) Convert the bencoded result from utf-8 to unicode
> 4) Store them in the conf file via ConfigObj
>
> And things are much worse when I need to bencode dicts or lists.
> Something is wrong here, n'est-ce pas?
>
> I hope my intent is clearer now.
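
For dicts and lists, step 1 means walking the whole structure. A rough sketch
of what that could look like follows. The helper names `to_bytes` and
`to_text` are hypothetical, not part of bzrlib or QBzr, and the code assumes
Python 3, where `str` is the unicode type:

```python
def to_bytes(value, encoding="utf-8"):
    """Recursively encode every unicode string in a nested structure."""
    if isinstance(value, str):
        return value.encode(encoding)
    if isinstance(value, list):
        return [to_bytes(v, encoding) for v in value]
    if isinstance(value, dict):
        return {to_bytes(k, encoding): to_bytes(v, encoding)
                for k, v in value.items()}
    return value  # ints and existing byte strings pass through unchanged


def to_text(value, encoding="utf-8"):
    """Inverse of to_bytes: decode every byte string back to unicode."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    if isinstance(value, list):
        return [to_text(v, encoding) for v in value]
    if isinstance(value, dict):
        return {to_text(k, encoding): to_text(v, encoding)
                for k, v in value.items()}
    return value


data = {"name": "caf\u00e9", "tags": ["a", "b"], "count": 3}
as_bytes = to_bytes(data)      # safe to hand to a bencode encoder
round_trip = to_text(as_bytes)  # back to unicode after decoding
```

Note that `to_text` decodes every byte string it finds, so this only round
trips cleanly when the original structure held no raw byte strings.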

Oh, I see.  So this is for the inter-process communication that QBzr does?
And/or for QBzr-specific values you are storing in configuration files?  Or
something else?

-Andrew.



