[MERGE][RFC] Add simple revision serializer based on RIO.

John Arbash Meinel john at arbash-meinel.com
Mon May 11 16:00:06 BST 2009


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Alexander Belchenko wrote:
> John Arbash Meinel wrote:
>> Martin Pool wrote:
>>> 2009/5/11 Matt Nordhoff <mnordhoff at mattnordhoff.com>:
>>>> Martin Pool wrote:
>>>> [snip]
>>>>
>>>>> However, before moving to RIO for future formats (and I say this
>>>>> having added the code) I would think hard about whether it should use
>>>>> bencode instead, which has the advantage of being able to represent
>>>>> somewhat more complex nesting (like dicts inside dicts) without
>>>>> needing a separate layer of encoding on top.  Revisions are pretty
>>>>> simple but even there it may be useful.  I'm not sure about the
>>>>> relative performance.
>>>> If I understand correctly, RIO is line-based but bencode is not. Is the
>>>> delta format still line-based? If so, using bencode would be more
>>>> difficult.
>>> We don't do line-by-line compression on revisions because generally
>>> speaking there's not much in common between them.  zlib compression in
>>> groupcompress will pick out common strings like committer names.  Good
>>> question though.
>>>
>>
>> We do "line-by-line" delta compression in --dev6 because I found that
>> this assumption was incorrect. We get 2:1 compression improvements
>> from delta compression, since the only difference for 'revisions'
>> fields is that --dev6 puts them in groups w/ delta compression.
> 
> I guess it is possible to extend the bencode format to accept a \n
> character between fields, so bzr can still benefit from line-by-line
> compression.
> 
> E.g. encoding a dict as
> 
> d\n
> 9:committer10:John Smith\n
> e
> 
> or something similar would do the trick?
> 
> Just a crazy idea, if bencode really is the fast choice.
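
For reference, the newline-separated variant suggested in the quote above
could be sketched roughly as follows. This is a minimal illustration
only: bencode_lines is a hypothetical helper, not bzrlib's bencode
module, and standard bencode has no newlines, so a matching decoder
would also have to be taught to skip them. It is written for Python 3.

```python
def bencode_lines(value):
    """Bencode ints, bytes, lists and dicts, with a hypothetical twist:
    a "\n" after the opening 'd' and after every dict entry, so that each
    key/value pair sits on its own line."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode_lines(v) for v in value) + b"e"
    if isinstance(value, dict):
        parts = [b"d\n"]
        # Bencode requires dict keys in sorted order.
        for key in sorted(value):
            parts.append(bencode_lines(key) + bencode_lines(value[key]) + b"\n")
        parts.append(b"e")
        return b"".join(parts)
    raise TypeError("cannot encode %r" % type(value))


# Reproduces the example from the quoted message.
encoded = bencode_lines({b"committer": b"John Smith"})
```

With this layout, two revisions that share a committer would share an
identical line, which is what a line-based delta needs in order to match.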

Sorry, I put "line-by-line" in quotes to hint that it was no longer
line-based. --dev6 uses a sliding window algorithm to do sub-line deltas.

The pure-python implementation is still line-based, but I'm not
specifically concerned about that.
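
To illustrate the difference between line-based and sub-line matching,
here is a rough sketch. It uses Python's difflib.SequenceMatcher as a
stand-in; it is not the sliding window algorithm used by --dev6, and the
function names and sample text are made up for the example.

```python
from difflib import SequenceMatcher


def matched_bytes_by_line(old, new):
    """Count characters that can be copied when only whole identical
    lines are allowed to match."""
    old_lines = old.splitlines(True)
    new_lines = new.splitlines(True)
    sm = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    total = 0
    for block in sm.get_matching_blocks():
        for line in old_lines[block.a:block.a + block.size]:
            total += len(line)
    return total


def matched_bytes_sub_line(old, new):
    """Count characters that can be copied when matches may start and
    end anywhere, including in the middle of a line."""
    sm = SequenceMatcher(None, old, new, autojunk=False)
    return sum(block.size for block in sm.get_matching_blocks())


# Two made-up revision texts that differ only inside the second line.
OLD = "committer: John Smith <john@example.com>\ntimestamp: 1242054006\n"
NEW = "committer: John Smith <john@example.com>\ntimestamp: 1242057606\n"

by_line = matched_bytes_by_line(OLD, NEW)
sub_line = matched_bytes_sub_line(OLD, NEW)
```

The line-based count can only reuse the unchanged committer line, while
the sub-line count also reuses the common parts of the timestamp line.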

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkoIPXUACgkQJdeBCYSNAAOmigCgmRORqQbD2GjFGILQN6h1buQL
3NsAoK2p96D0CUoZ9NFPNtH0gQ+9R/Dh
=yVfC
-----END PGP SIGNATURE-----


