Telling if two trees are different
Robert Collins
robertc at robertcollins.net
Fri Oct 24 06:39:59 BST 2008
On Fri, 2008-10-24 at 13:10 +0900, Stephen J. Turnbull wrote:
>
> Use git.<duck />
>
> Seriously, this is precisely what git is designed for. Why not create
> a plugin that implements git's object index, as an auxiliary? It
> can't be that hard, git is such a simple thing. It's just that
> instead of pointing into the .git/objects store for their content,
> blobs would point at a file/revision pair in the bzr store. Trees,
> commits, and tags would be the same (and maybe for your application
> implementing commits and tags would be unnecessary).
A few points here. First, in git the blobs in the object store are the
encoded blobs, not the bare content.
That is, there is a header + content for anything in the database.
So a file X can be in the database many times, with different
content pointers - the object pointers will say 'not equal' - even though
the file content is identical.
Secondly, the cost of translating a bzr inventory into any other
representation is _always_ much greater than that of doing
revtree1.iter_changes(revtree2).next()
It's greater because iter_changes may do fewer comparisons than the size
of the tree when the trees support it, but translating shapes requires:
- translate tree1
- translate tree2
- now compare the root CHK.
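
To make the difference concrete, here is a toy sketch in Python 3. It is
not bzrlib code: the trees are modelled as path-sorted lists of
(path, sha1) pairs, and iter_changes, root_key and the two differ_by_*
helpers are hypothetical stand-ins written for this illustration only.
The lazy version stops at the first differing entry; the translating
version must hash every entry of both trees before it can compare
anything.

```python
import hashlib


def iter_changes(old_items, new_items):
    """Lazily yield (path, old_sha, new_sha) for entries that differ.

    Both arguments are lists of (path, sha1) pairs sorted by path.
    This is a toy stand-in, not the bzrlib method of the same name.
    """
    i = j = 0
    while i < len(old_items) or j < len(new_items):
        if j >= len(new_items) or (
            i < len(old_items) and old_items[i][0] < new_items[j][0]
        ):
            # Path only exists in the old tree: it was removed.
            yield (old_items[i][0], old_items[i][1], None)
            i += 1
        elif i >= len(old_items) or new_items[j][0] < old_items[i][0]:
            # Path only exists in the new tree: it was added.
            yield (new_items[j][0], None, new_items[j][1])
            j += 1
        else:
            # Same path in both trees: report it only if the sha1 changed.
            if old_items[i][1] != new_items[j][1]:
                yield (old_items[i][0], old_items[i][1], new_items[j][1])
            i += 1
            j += 1


def differ_by_iteration(old_items, new_items):
    # Stops as soon as the generator produces its first change.
    return next(iter_changes(old_items, new_items), None) is not None


def root_key(items):
    # Hypothetical "translated" form: one hash over the whole listing.
    digest = hashlib.sha1()
    for path, sha in items:
        digest.update(f"{path}\0{sha}\n".encode("utf-8"))
    return digest.hexdigest()


def differ_by_translation(old_items, new_items):
    # Translate tree1, translate tree2, then compare the root keys.
    return root_key(old_items) != root_key(new_items)
```

Both functions give the same answer; the point is only that the second
one always touches every entry of both trees, while the first may not.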
In the specific case of translating into a fully git-like store
including the file texts (I'll just call this a CHK store from now on)
the cost is much larger than for just the metadata, because you have to
insert the file text to find out the CHK, even though bzr knows the
file's SHA1 from the inventory.
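
As a small illustration of why the stored SHA1 does not help here: git
computes a blob's id over a "blob <size>\0" header followed by the
content, so the id cannot be derived from a SHA1 of the content alone.
The sketch below is Python 3; the function names content_sha1 and
git_blob_id are made up for this example, and it assumes the SHA1 held in
the inventory is taken over the plain file content.

```python
import hashlib


def content_sha1(data: bytes) -> str:
    # SHA1 over the file content only (assumed to match what the
    # inventory records for a file text).
    return hashlib.sha1(data).hexdigest()


def git_blob_id(data: bytes) -> str:
    # git hashes a header ("blob", the size in bytes, a NUL byte)
    # followed by the content, so the text itself must be read.
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


text = b"hello\n"
print(content_sha1(text))
print(git_blob_id(text))
```

The two digests differ for the same bytes, and getting from one to the
other means reading the whole text again.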
By contrast, that Python code fragment will use the best facilities of
the underlying trees (it's a multimethod). The trees in my repository
branch are similar in nature to git's directory storage, in that there
are multiple documents to represent a single tree's
list-of-files-and-dirs, compared to hg's manifests and [mainline] bzr's
inventories.
I haven't yet written the iter_changes optimiser for these trees, but it
will have work to do similar to what git does to compare two trees
(though I hope it will have a better scaling factor due to balancing its
own tree rather than following directory boundaries.)
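
For readers who have not seen that style of comparison, a rough Python 3
sketch follows. It is not the optimiser (which is not written yet) and
not git's code: make_node and diff_nodes are hypothetical names, and the
sketch splits the tree along directory boundaries the way git does, so
the self-balancing layout mentioned above is not modelled. Each node
carries a key derived from its entries, and any pair of subtrees with
equal keys is skipped without being opened.

```python
import hashlib


def make_node(entries):
    """Build a directory node.

    entries maps a name to either a file sha1 string or a child node
    previously returned by make_node.
    """
    digest = hashlib.sha1()
    for name in sorted(entries):
        child = entries[name]
        if isinstance(child, dict):
            kind, child_key = "dir", child["key"]
        else:
            kind, child_key = "file", child
        digest.update(f"{kind} {name}\0{child_key}\n".encode("utf-8"))
    return {"key": digest.hexdigest(), "entries": entries}


def diff_nodes(old, new, prefix=""):
    """Lazily yield the paths that differ between two nodes."""
    if old["key"] == new["key"]:
        return  # identical subtree: nothing below it needs reading
    for name in sorted(set(old["entries"]) | set(new["entries"])):
        a = old["entries"].get(name)
        b = new["entries"].get(name)
        path = prefix + name
        if isinstance(a, dict) and isinstance(b, dict):
            # Directory on both sides: descend (or skip, if keys match).
            yield from diff_nodes(a, b, path + "/")
        elif a != b:
            # Changed file, added or removed entry, or file/dir swap.
            yield path
```

With this shape, the work done is proportional to the parts of the tree
that actually changed, plus the nodes on the way down to them.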
-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.