things-to-do-in-chk-repository
Robert Collins
robertc at robertcollins.net
Tue Nov 11 06:18:49 GMT 2008
On Mon, 2008-11-10 at 20:42 -0500, Aaron Bentley wrote:
> Robert Collins wrote:
> > So here are some 'hot' topics in this branch:
> > - write a RevisionTree.iter_changes(RevisionTree) optimiser
> > that picks up on the type of the inventory to fast-path
> > deltas using the CHKInventory guts. (poolie is looking at this)
>
> For RevisionTree-to-RevisionTree, a richer API that included sha1sum
> would make a lot of sense.
Indeed; I think an interim step would be to do
tree.get_file_sha1sum(file_id, path), which should be fast on dirstate,
and fast in the split inventories (because that entry will be in the
cache from generating the iter_changes output). It may be sufficient to
do just this in fact, for all but the largest merges.
> > - get 'st -r -2' to do inventory delta composition - that is to do
> > wt.iter_changes(basis_tree) and RevisionTree(-1).iter_changes(
> > RevisionTree(-2)), and combine the results. Combined with the
> > optimiser for (RT,RT) above this should lead to very fast diffs
> > with deep history (both because we don't need to generate a full
> > inventory at any point, and because the repository can be optimised
> > too).
> >
> > I think the delta composition is an important thing to work on, because
> > it will be difficult to tell if the design is successful until that is
> > working.
>
> I think the kind of delta composition we need to do is dead simple:
> for WT -> BASIS, generate an inventory entry* of each modified file.
> For BASIS -> REVISION_TREE, generate an inventory entry of each modified
> file. For file-ids in REVISION_TREE that are missing from WT, copy them
> from BASIS. For file-ids in WT that are missing from REVISION_TREE,
> copy them from BASIS.
>
> Then it should be trivial to generate iter_changes-style output from the
> WT and REVISION_TREE inventory entries.
>
> * We don't need real inventory entries, but we'll want sha1sum so that
> we detect cases where REVISION_TREE and WT have the same content, but
> BASIS is different.
I think a real inventory entry is the simplest thing to do; while making
objects isn't the fastest thing around, because we're dealing with
size(changes) it's acceptable (compared to the current system!).
Thanks for analysing the logic in more detail.
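
As a rough illustration of the composition described above, here is a
minimal Python sketch. It is not bzrlib code: Entry, compose_deltas and
the dict-based delta shape are hypothetical names made up for the
example, and a real implementation would work from iter_changes output
and real inventory entries. Each delta maps a file-id to a
(basis_entry, other_entry) pair, with None meaning the file is absent
from that tree.

```python
from collections import namedtuple

# Hypothetical stand-in for an inventory entry (not bzrlib's InventoryEntry).
# The sha1 field lets us notice when WT and REVISION_TREE agree even though
# BASIS differs from both.
Entry = namedtuple("Entry", "file_id path sha1")


def compose_deltas(wt_delta, rev_delta):
    """Combine BASIS->WT and BASIS->REVISION_TREE deltas.

    wt_delta:  {file_id: (basis_entry, wt_entry)}
    rev_delta: {file_id: (basis_entry, rev_entry)}
    Returns a list of (file_id, rev_entry, wt_entry) for every file that
    differs between REVISION_TREE and WT, ordered by file_id.
    """
    changes = []
    for file_id in sorted(set(wt_delta) | set(rev_delta)):
        if file_id in wt_delta:
            basis_entry, wt_entry = wt_delta[file_id]
        else:
            # Unmodified in WT relative to BASIS: copy the entry from BASIS.
            basis_entry = rev_delta[file_id][0]
            wt_entry = basis_entry
        if file_id in rev_delta:
            rev_entry = rev_delta[file_id][1]
        else:
            # Unmodified in REVISION_TREE relative to BASIS: copy from BASIS.
            rev_entry = basis_entry
        # Entries compare equal when file_id, path and sha1 all match, so a
        # file changed identically on both sides is not reported.
        if rev_entry != wt_entry:
            changes.append((file_id, rev_entry, wt_entry))
    return changes
```

The work done is proportional to the number of changed file-ids in the
two deltas, never to the size of a full inventory.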
-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.