[MERGE][RFC] Add simple revision serializer based on RIO.
Alexander Belchenko
bialix at ukr.net
Mon May 11 15:17:25 BST 2009
John Arbash Meinel wrote:
> Martin Pool wrote:
>> 2009/5/11 Matt Nordhoff <mnordhoff at mattnordhoff.com>:
>>> Martin Pool wrote:
>>> [snip]
>>>
>>>> However, before moving to RIO for future formats (and I say this
>>>> having added the code) I would think hard about whether it should use
>>>> bencode instead, which has the advantage of being able to represent
>>>> somewhat more complex nesting (like dicts inside dicts) without
>>>> needing a separate layer of encoding on top. Revisions are pretty
>>>> simple but even there it may be useful. I'm not sure about the
>>>> relative performance.
>>> If I understand correctly, RIO is line-based but bencode is not. Is the
>>> delta format still line-based? If so, using bencode would be more difficult.
>> We don't do line-by-line compression on revisions because generally
>> speaking there's not much in common between them. zlib compression in
>> groupcompress will pick out common strings like committer names. Good
>> question though.
>>
>
> We do "line-by-line" delta compression in --dev6 because I found that
> this assumption was incorrect. We get 2:1 compression improvements by
> doing delta compression, since the only difference for 'revisions'
> fields is the fact that --dev6 puts them in groups w/ delta compression.
I guess it is possible to extend the bencode format to accept a \n
character between fields, so that bzr can still benefit from
line-by-line compression.
E.g. encoding a dict as
d\n
9:committer10:John Smith\n
e
or something similar might do the trick?
Just a crazy idea, in case bencode really is the fast choice.
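
To make the idea concrete, here is a minimal Python sketch of such a newline-tolerant bencode variant. It is not bzr's bencode module or any existing format: the names `encode` and `decode` are hypothetical, and the rule "one line per dict entry, newlines ignored between fields" is only my assumption about how the extension could work.

```python
# Hypothetical line-friendly bencode variant (illustration only, not bzr code).
# Assumption: each dict entry is followed by b"\n", and the decoder ignores
# newlines that appear between fields. Newlines inside a string are kept,
# because strings are length-prefixed.


def encode(value):
    """Encode bytes, int, list, or dict; dicts get one line per entry."""
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, list):
        return b"l" + b"".join(encode(v) for v in value) + b"e"
    if isinstance(value, dict):
        parts = [b"d\n"]
        for key in sorted(value):  # bencode requires sorted keys
            parts.append(encode(key) + encode(value[key]) + b"\n")
        parts.append(b"e")
        return b"".join(parts)
    raise TypeError("cannot encode %r" % type(value))


def _skip_newlines(data, pos):
    """Advance past any separator newlines starting at pos."""
    while data[pos:pos + 1] == b"\n":
        pos += 1
    return pos


def decode(data, pos=0):
    """Return (value, next_pos) for the item starting at pos."""
    pos = _skip_newlines(data, pos)
    head = data[pos:pos + 1]
    if head == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if head in (b"l", b"d"):
        pos += 1
        items = []
        while True:
            pos = _skip_newlines(data, pos)
            if data[pos:pos + 1] == b"e":
                pos += 1
                break
            item, pos = decode(data, pos)
            items.append(item)
        if head == b"l":
            return items, pos
        # Dict items alternate key, value, key, value, ...
        return dict(zip(items[0::2], items[1::2])), pos
    # Otherwise a length-prefixed string: <length>:<bytes>
    colon = data.index(b":", pos)
    length = int(data[pos:colon])
    start = colon + 1
    return data[start:start + length], start + length


rev = {b"committer": b"John Smith", b"message": b"first line\nsecond line"}
text = encode(rev)
value, _ = decode(text)
```

Because every entry ends with a newline, two revisions by the same committer would share an identical `9:committer10:John Smith` line, which is what a line-based delta needs. Whether this is actually faster or smaller than RIO would have to be measured.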