[rfc] bencode unicode strings
Andrew Bennetts
andrew.bennetts at canonical.com
Tue Jun 16 09:25:12 BST 2009
Alexander Belchenko wrote:
> For QBzr needs I need to implement support for bencoding unicode
> strings. Standard bencode uses strings in the stream as byte streams.
>
> Because Qt internally works with pure unicode (not utf-8 as gtk) then I
> have to encode strings to utf-8 manually. I want bencode to handle this
> for me.
>
> I'm not sure how it is handled in the revision serializer, and does it
> make sense to have such support in the core?
The bencode format only has the concept of byte-strings, not unicode, so
currently you need to encode (and decode) explicitly. Because bzr internally
tends to use utf-8 a fair bit, this hasn't been a big burden so far.

I'm curious to know why explicitly encoding and decoding is so much more of an
issue for QBzr. Perhaps naïvely, I would have expected that data like commit
messages and committer names would already be encoded/decoded for you by bzrlib.
Why is QBzr touching revision serialisation directly?
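
To make the explicit encode/decode step concrete, here is a minimal sketch in
Python 3. It is not bzrlib's bencode module: it handles only the byte-string
type, and the helper names (bencode_bytes, bdecode_bytes, bencode_text,
bdecode_text) are hypothetical, chosen for illustration. It assumes utf-8 as
the agreed encoding on both sides.

```python
# Minimal sketch, NOT bzrlib's implementation: only the bencode
# byte-string type ("<length>:<data>") is handled here.
# All function names are hypothetical.


def bencode_bytes(data: bytes) -> bytes:
    """Bencode a byte-string as b"<length>:<data>"."""
    if not isinstance(data, bytes):
        # bencode has no unicode type, so text must be encoded first.
        raise TypeError("expected bytes, got %s" % type(data).__name__)
    return str(len(data)).encode("ascii") + b":" + data


def bdecode_bytes(stream: bytes) -> bytes:
    """Decode a single bencoded byte-string and return its payload."""
    length, sep, rest = stream.partition(b":")
    if not sep:
        raise ValueError("missing ':' separator")
    expected = int(length.decode("ascii"))
    if len(rest) != expected:
        raise ValueError("length prefix does not match payload size")
    return rest


def bencode_text(text: str) -> bytes:
    """Wrapper a unicode-based caller might use (assumes utf-8)."""
    return bencode_bytes(text.encode("utf-8"))


def bdecode_text(stream: bytes) -> str:
    """Inverse of bencode_text: decode the payload as utf-8."""
    return bdecode_bytes(stream).decode("utf-8")
```

Note that the length prefix counts bytes, not characters, which is why the
encoding has to happen before bencoding rather than after.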
-Andrew.