[MERGE] Review feedback on hash trie inventories, and describe radix tree inventories, plus some details on hash trie implementation.
Robert Collins
robertc at robertcollins.net
Tue Sep 2 06:04:21 BST 2008
On Thu, 2008-08-28 at 16:03 +0200, Vincent Ladeuil wrote:
> >>>>> "robert" == Robert Collins <robertc at robertcollins.net> writes:
>
> <snip/>
>
> robert> Canonical form isn't delta based at all;
>
> May be I mispoked then, what I meant is that the full
> representation is stored as fragments on disk somehow. Accessing
> all the fragments may have a cost. This cost can be reduced by
> putting all the fragments at the same place, eventually
> *duplicating* the same information. As long as the representation
> is canonical this should not be a problem.
This is what the CHK index is about- by storing fragments under their
CHK, we only store one copy for a given fragment. We *could* inside the
storage layer, do text deltas between similar fragments (and probably
will need to) - but there needs to be a very hard, and very clear line
between the layers. The inventory layer must-not-store-deltas, and most
not depend-on-deltas-for-efficient-operations.
> robert> I think the penny hasn't quite dropped all the way
> robert> into the slot :). Also note that we're not using a
> robert> hash partition, there are no delta representations at
> robert> all. Its history independent (which is one of the
> robert> goals) - anything that ever writes deltas isn't
> robert> history independent.
>
> Hmm. Just to be clear, in the example you gave:
>
> robert> Ok, so - we start at the root node, and deserialise
> robert> it. This gives us a list of child node CHK's, as
> robert> pointers in a tree. e.g.
>
> robert> '0' : HASH1
> robert> '1' : HASH2
> robert> '2' : HASH3
>
> robert> We then read the other root node, which might give us:
> robert> '0' : HASH1
> robert> '1' : HASH4
> robert> '2' : HASH5
>
> Is the tree associated with HASH0 stored once and shared between
> the two trees or stored for each tree ?
There isn't a HASH0 in my example. There is a subtree named
'0' (contains keys with the prefix '0'), with a pointer of HASH1.
Its logically stored once for each tree. If they are in the same
repository there will be a single physical copy of it.
-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20080902/849a07ae/attachment.pgp
More information about the bazaar
mailing list