[rfc] bencode unicode strings

Alexander Belchenko bialix at ukr.net
Tue Jun 16 11:20:55 BST 2009


Alexander Belchenko wrote:
> Andrew Bennetts wrote:
>> Alexander Belchenko wrote:
>>> Andrew Bennetts wrote:
>>>> Alexander Belchenko wrote:
>>>>> For QBzr I need to implement support for bencoding unicode strings.
>>>>> Standard bencode treats strings in the stream as byte strings.
>>>>>
>>>>> Because Qt internally works with pure unicode (not utf-8, as gtk does),
>>>>> I have to encode strings to utf-8 manually. I want bencode to handle
>>>>> this for me.
>>>>>
>>>>> I'm not sure how this is handled in the revision serializer. Does it
>>>>> make sense to have such support in the core?
>>>> The bencode format only has the concept of byte-strings, not unicode.
>>>> So currently you need to explicitly encode (and decode).
>>> Didn't I write exactly the same thing in my first mail?
>>
>> Close.  I wanted to emphasise that adding what you want to bencode would
>> mean an incompatible change to the bencode format.
> 
> Not really.
> 
> The Python bencode implementation is highly modular, so I can subclass
> Decoder and extend it to handle unicode strings, and then create an
> additional function, say bdecodeu.
> 
> Similarly, I can extend the encoder and teach it to handle unicode,
> and provide a new function, bencodeu.

Actually, this is what I'm planning to implement for QBzr.
My first mail was a (bad) attempt to ask whether there is interest in such
a thing in the core, because of Vincent's suggestions about various common
things. If not -- then not.
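
To make the idea concrete, here is a rough sketch of what bencodeu and
bdecodeu could look like. It is only an illustration under several
assumptions: it is written for Python 3, it uses its own tiny bencode
codec instead of the bzrlib one, and it wraps plain functions rather than
subclassing Decoder. The helper names _to_bytes and _to_text are made up
for this example. It also assumes that every string in the stream is
utf-8 text, which the wire format itself cannot tell you.

```python
# Minimal self-contained sketch; not the bzrlib bencode module.

def bencode(obj):
    """Encode ints, bytes, lists/tuples and dicts as bencode."""
    if isinstance(obj, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(obj, int):
        return b"i%de" % obj
    if isinstance(obj, bytes):
        return b"%d:%s" % (len(obj), obj)
    if isinstance(obj, (list, tuple)):
        return b"l" + b"".join(bencode(x) for x in obj) + b"e"
    if isinstance(obj, dict):
        # Keys are emitted in sorted order, as raw byte strings.
        items = sorted(obj.items())
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(obj))


def _decode(data, pos):
    """Decode one value starting at pos; return (value, next position)."""
    c = data[pos:pos + 1]
    if c == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if c == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = _decode(data, pos)
            items.append(item)
        return items, pos + 1
    if c == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = _decode(data, pos)
            value, pos = _decode(data, pos)
            result[key] = value
        return result, pos + 1
    if c.isdigit():
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError("invalid bencode data at position %d" % pos)


def bdecode(data):
    value, pos = _decode(data, 0)
    if pos != len(data):
        raise ValueError("trailing data after bencoded value")
    return value


def _to_bytes(obj):
    """Recursively encode str to utf-8 bytes (hypothetical helper)."""
    if isinstance(obj, str):
        return obj.encode("utf-8")
    if isinstance(obj, (list, tuple)):
        return [_to_bytes(x) for x in obj]
    if isinstance(obj, dict):
        return {_to_bytes(k): _to_bytes(v) for k, v in obj.items()}
    return obj


def _to_text(obj):
    """Recursively decode bytes as utf-8 (assumes all strings are text)."""
    if isinstance(obj, bytes):
        return obj.decode("utf-8")
    if isinstance(obj, list):
        return [_to_text(x) for x in obj]
    if isinstance(obj, dict):
        return {_to_text(k): _to_text(v) for k, v in obj.items()}
    return obj


def bencodeu(obj):
    """bencode, but accept unicode strings and store them as utf-8."""
    return bencode(_to_bytes(obj))


def bdecodeu(data):
    """bdecode, but return every string as unicode."""
    return _to_text(bdecode(data))
```

Since the unicode handling happens entirely before encoding and after
decoding, the bytes on the wire are ordinary bencode: bencodeu(u"abc")
produces the same stream as bencode(b"abc"), so a plain bdecode on the
other side still works and simply returns utf-8 byte strings.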



