sydney mini-sprint, kickstarting 0.16, roadmap for 0.16

John Arbash Meinel john at arbash-meinel.com
Tue Mar 27 14:43:07 BST 2007


Aaron Bentley wrote:
> Robert Collins wrote:
>> hpss first-pull optimisation
>> hpss first-push optimisation
> 
> In terms of hpss optimisation, it might be worth looking at Monotone's
> netsync protocol.  Apparently, they're able to use Merkle Tries to
> determine the missing revisions without sending a lot of revisions.  I
> can't say much more, as I don't know that math very well.
> 
> Aaron


I think it makes the most sense for them, because they already use SHA
hashes as revision handles. We could hash our revision_ids, but then you
end up sending a lot of data that only contains hints about the data,
rather than sending the data you actually care about.

Mercurial uses "branch-points" based on what "heads" are currently
present in a repository:
http://www.selenic.com/mercurial/wiki/index.cgi/WireProtocol

Robert and I have discussed this in the past, and there are a couple of
points to consider:

1) You don't have to be perfectly accurate. Especially on a link with
high latency but reasonable bandwidth, sending 1% duplicate revisions
usually costs less than the extra round trips needed to work out the
exact list.

2) We really want to leverage the ancestry aspect. I also think we can
have a dialog between client and server, where the client could say "I'm
interested in what you have, and here is a sampling of what I have". The
server can then use that sample to suggest what the client should request.
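
To make the dialog in point 2 concrete, here is a rough Python sketch.
It is only an illustration, not bzr or hpss code: the function names,
the `step` parameter and the dict-of-parents graph layout are all
assumptions made up for the example. The client samples its own
ancestry, and the server treats every sampled revision it recognises
(plus that revision's ancestors) as already present on the client.

```python
from collections import deque


def ancestors(graph, heads):
    """Return every revision reachable from `heads` through parent links.

    `graph` maps a revision id to a tuple of parent ids (hypothetical layout).
    Revisions missing from `graph` are skipped.
    """
    seen = set()
    todo = deque(heads)
    while todo:
        rev = todo.popleft()
        if rev in seen or rev not in graph:
            continue
        seen.add(rev)
        todo.extend(graph[rev])
    return seen


def sample_ancestry(graph, head, step=2):
    """Client side: sample every `step`-th revision along the first-parent line."""
    sample = []
    rev = head
    index = 0
    while rev is not None:
        if index % step == 0:
            sample.append(rev)
        parents = graph.get(rev, ())
        rev = parents[0] if parents else None
        index += 1
    return sample


def revisions_to_send(server_graph, server_head, client_sample):
    """Server side: the server's ancestry minus what the sample proves the client has."""
    known = [rev for rev in client_sample if rev in server_graph]
    common = ancestors(server_graph, known)
    wanted = ancestors(server_graph, [server_head])
    return wanted - common


# Server has A <- B <- C <- D <- E; client has A <- B <- C plus a local X.
server = {"A": (), "B": ("A",), "C": ("B",), "D": ("C",), "E": ("D",)}
client = {"A": (), "B": ("A",), "C": ("B",), "X": ("C",)}

sample = sample_ancestry(client, "X", step=2)
print(sample)
print(sorted(revisions_to_send(server, "E", sample)))
```

With `step=2` the sample skips C, so the server would send C again even
though the client already has it. That is the trade-off from point 1: a
sparser sample means a smaller request and fewer round trips, at the
cost of a few duplicate revisions.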

But certainly I think we should at least look over all of the other
synchronization protocols, to make sure we don't miss anything obvious.

John
=:->


