[MERGE] Review feedback on hash trie inventories, and describe radix tree inventories, plus some details on hash trie implementation.
Vincent Ladeuil
v.ladeuil+lp at free.fr
Wed Aug 27 09:27:59 BST 2008
>>>>> "robert" == Robert Collins <robertc at robertcollins.net> writes:
robert> Thanks Vila, the reward for a review, is another review.
Perfect. Last night brought more concerns on the subject;
I'll try to articulate them as best I can.
Disclaimer:
I have ideas and experience (at least 2 significant
implementations) with hash-trees (for lack of a better name) as
a way to quickly and efficiently obtain differences between
trees.
Yet I still lack the knowledge about bzr internals to say:
"this is how we should implement it". So I would appreciate any
explanation while I catch up on those internals.
There is currently a gap between what I understand from your
description and my mental model of how hash-trees work.
I think one way to fill that gap is to review. That may mean that
I should vote 'comment' instead of 'approve', but since we are
talking about design here this may be counter-productive for both
of us.
End of disclaimer.
robert> I've not done an incremental delta - a full reread
robert> may catch other quirks, and more review is good for
robert> this sort of thing..
No problem with that, I read bazaar-commits at lists.ubuntu.com for
incremental diffs :)
<snip/>
robert> --- doc/developers/inventory.txt 2008-01-03 01:40:16 +0000
robert> +++ doc/developers/inventory.txt 2008-08-27 00:42:39 +0000
<snip/>
robert> In memory inventories
robert> =====================
robert> @@ -39,9 +41,411 @@
robert> All the xml serialized forms write to and read from a single byte string, whose
robert> hash is then the inventory validator for the commit object.
robert> -journalled
robert> -----------
robert> -
robert> -The in development journalled inventory serializer generates a single byte
robert> -string during serialization, but may require many byte strings to deserialize,
robert> -and these are discovered recursively.
robert> +
robert> +Serialization scaling and future designs
robert> +========================================
robert> +
robert> +Overall efficiency and scaling is constrained by the bottom level structure
robert> +that an inventory is stored as. We have a number of goals we want to achieve:
robert> +
robert> + 1. Allow commit to write less than the full tree's data in to the repository
robert> + in the general case.
I'm not sure we can respect that AND get an efficient
(speed-wise) implementation of tree differences.
IME some representation of the full tree is needed when it comes
to calculating the differences. There may be solutions that work
with partial representations (or build/load them on demand), but
I can't imagine them without filling the gap I mentioned earlier.
Put another way: I'm sure a huge performance gain can be
achieved; do we really need to address the space gain too? On the
other hand, working with less data is often a good way to achieve
performance gains...
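Thinking aloud with a toy sketch (the store and fragment names
are mine, nothing here is existing bzr code): if unchanged
subtrees keep their CHK, commit only needs to write the fragments
along the changed paths, so the space gain and the speed gain may
come from the same sharing.

    def write_changed_fragments(store, new_fragments):
        # store: CHK -> serialised fragment already in the repository.
        # new_fragments: CHK -> serialised fragment for the new tree.
        # Unchanged subtrees keep their old CHK, so they cost nothing.
        for chk, data in new_fragments.items():
            if chk not in store:
                store[chk] = data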
robert> + 2. Allow the data that is written to be calculated without examining every
robert> + versioned path in the tree.
robert> + 3. Generate the exact same representation for a given inventory regardless of
robert> + the amount of history available.
No problems with these two.
Well, as long as the canonical representation is adequate :)
robert> + 4. Allow in memory deltas to be generated directly from the serialised form
robert> + without upcasting to a full in-memory representation or examining every
robert> + path in the tree. Ideally the work performed will be proportional to the
robert> + amount of changes between the trees being compared.
Ok for the 'without examining every path in the tree' and 'the
work performed will be proportional to the amount of changes
between the trees being compared'.
I have some doubts about the 'without upcasting to a full
in-memory representation' part. But I can imagine on-demand
loading of the full representation (which also means never
loading the common parts).
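Here is a minimal sketch of the on-demand loading I have in mind
(the store, the parse_node helper and the node interface are all
hypothetical, not existing bzr APIs):

    class LazyNode(object):
        """A map node that only deserialises itself when first accessed."""

        def __init__(self, store, chk):
            self._store = store   # maps CHK -> serialised fragment
            self._chk = chk
            self._node = None     # parsed form, filled in lazily

        def _ensure_loaded(self):
            if self._node is None:
                # parse_node is hypothetical: bytes -> parsed node
                self._node = parse_node(self._store[self._chk])
            return self._node

        def items(self):
            return self._ensure_loaded().items()

Common subtrees whose CHK pointers are never followed would never
be loaded at all.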
robert> + 5. Allow fetch to determine the file texts that need to be pulled to ensure
robert> + that the entire tree can be reconstructed without having to probe every
robert> + path in the tree.
robert> + 6. Allow bzr map paths
Is it 'bzr to map'?
robert> to file ids without reading the entire serialised
robert> + form. This is something that is used by commands such as merge PATH and
robert> + diff -r X PATH.
robert> + 7. Let bzr map file ids to paths without reading the entire serialised form.
robert> + This is used by commands that are presenting output to the user such as
robert> + loggerhead, bzr-search, log FILENAME.
Lost in the gap.
robert> + 8. We want a strong validator for inventories which is cheap to generate.
robert> + Specifically we should be able to create the generator for a new commit
robert> + without processing all the data of the basis commit.
robert> + 9. Testaments generation is currently size(tree), we would like to create a
robert> + new testament standard which requires less work so that signed commits
robert> + are not significantly slower than regular commits.
Half-lost in the gap but doesn't sound too hard to get.
robert> +
robert> +
robert> +We have current performance and memory bugs in log -v, merge, commit, diff -r,
robert> +loggerhead and status -r which can be addresessed by an inventory system
s/addresessed/addressed/ or s/addresessed/assessed/
robert> +meeting these goals.
robert> +
<snip/>
robert> +Hash bucket based inventories
robert> +=============================
robert> +
robert> +Overview
robert> +--------
robert> +
robert> +We store two maps - fileid:inventory_entry and path:fileid, in a stable
robert> +hash trie, stored in densely packed fragments.. We pack keys into the map
s/.././
<snip/>
robert> +Canonical form
robert> +--------------
robert> +
robert> +There are three fragment types for the canonical form:
robert> +
robert> +root_node: (Perhaps this should be inlined into the revision object).
robert> +HASH_INVENTORY_SIGNATURE
robert> +path_map: CHK to root of path to id map
You don't explain what CHK means. Is it just CHecK, as in a hash
addressing the path_map?
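If CHK is indeed just a content hash used as the key of the
serialised fragment, I picture something like this (sha1 is my
arbitrary choice, nothing in the text names the hash function):

    import hashlib

    def chk(serialised_fragment):
        # The fragment is addressed by the hash of its canonical
        # serialisation: identical content always gets the same key.
        return hashlib.sha1(serialised_fragment).hexdigest()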
robert> +content_map: CHK to root of id to entry map
robert> +
robert> +map_node: INTERNAL_NODE or LEAF_NODE
You later define leaf_node, not LEAF_NODE. I'm used to lowercase
identifiers for non-terminals and uppercase identifiers for
terminals.
Also, indenting the continuation lines in the right-hand side
(RHS) of the rules would make them easier to read.
robert> +INTERNAL_NODE:
robert> +INTERNAL_NODE_SIGNATURE
robert> +hash_prefix: PREFIX
robert> +prefix_width: INT
robert> +PREFIX CHK TYPE SIZE
robert> +PREFIX CHK TYPE SIZE ...
Err, I'm lost here. hash_prefix and prefix_width are not
referenced and hash_prefix is defined twice.
robert> +
robert> +(Where TYPE is I for internal or L for leaf).
robert> +
robert> +leaf_node:
robert> +LEAF_NODE_SIGNATURE
robert> +hash_prefix: PREFIX
robert> +HASH\x00KEY\x00 VALUE
\x00 is an implementation detail, can we get rid of it for now?
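For what it's worth, here is how I read the leaf format (purely
my reconstruction; the sorting and the function name are mine):

    def serialise_leaf(hash_prefix, entries):
        # entries: list of (hash, key, value) tuples, every hash
        # sharing hash_prefix. Sorting keeps the form canonical.
        lines = ['LEAF_NODE_SIGNATURE\n',
                 'hash_prefix: %s\n' % hash_prefix]
        for hash_, key, value in sorted(entries):
            lines.append('%s\x00%s\x00 %s\n' % (hash_, key, value))
        return ''.join(lines)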
<snip/>
robert> +Delta
robert> +-----
robert> +
robert> +To generate a delta between two inventories, we first generate a list of
robert> +altered fileids, and then recursively look up their parents to generate their
robert> +old and new file paths.
robert> +
robert> +To generate the list of altered file ids, we do an entry by entry comparison of
robert> +the full contents of every leaf node that the two inventories do not have in
robert> +common. To do this, we start at the root node, and follow every CHK pointer
robert> +that is only in one tree. We can then bring in all the value from the leaf
robert> +nodes and do a set difference to get the altered ones, which we would then
robert> +parse.
This sounds a lot like what I'm after, but I don't yet have the
understanding needed to validate it.
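To check my understanding, here is the prototype I would start
from (store, is_leaf, child_chks and values are names I made up,
and I assume the canonical form puts common fragments at the same
depth on both sides):

    def altered_values(store, root_a, root_b):
        # Walk both tries level by level, never descending into a CHK
        # present on both sides: same CHK means same content, so those
        # subtrees cannot contain differences.
        level_a, level_b = set([root_a]), set([root_b])
        values_a, values_b = set(), set()
        while level_a or level_b:
            common = level_a & level_b
            level_a, level_b = level_a - common, level_b - common
            next_a, next_b = set(), set()
            for level, next_level, values in ((level_a, next_a, values_a),
                                              (level_b, next_b, values_b)):
                for chk in level:
                    node = store[chk]
                    if node.is_leaf():
                        values.update(node.values())
                    else:
                        next_level.update(node.child_chks())
            level_a, level_b = next_a, next_b
        # The set difference gives the altered entries on each side.
        return values_a ^ values_b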
robert> +
robert> +Radix tree based inventories
robert> +============================
robert> +
<snip/>
As summarized below, I lack the knowledge to validate the
differences between hash tries and radix trees; nothing sounds
wrong while reading, though.
robert> +Hash Trie details
robert> +=================
robert> +
robert> +The canonical form for a hash trie is a tree of internal nodes leading down to
robert> +leaf nodes, with no node exceeding some threshold size, and every node
robert> +containing as much content as it can, but no leaf node containing less than
robert> +its lower size threshold. (In the event that an imbalance in the hash function
robert> +causes a tree where an internal node is needed, but any prefix generates a
robert> +child with less than the lower threshold, the smallest prefix should be taken).
robert> +An internal node holds some number of key prefixs, all with the same bit-width.
s/prefixs/prefixes/ ?
<snip/>
robert> +Insertion
robert> +---------
robert> +
robert> +1. Hash the entry, and insert the entry in the leaf node with a matching
robert> + prefix, creating that node and linking it from the internal node containing
robert> + that prefix if there is no appropriate leaf node.
robert> +1. Starting at the highest node altered, for all altered nodes, check if it has
robert> + transitioned across either size boundary - 0 < min_size < max_size. If it
robert> + has not, proceed to update the CHK pointers.
robert> +1. If it increased above min_size, check the node above to see if it can be
robert> + more densely packed. To be below the min_size the node's parent must
robert> + have hit the max size constraint and been forced to split even though this
robert> + child did not have enough content to support a min_size node - so the prefix
robert> + chosen in the parent may be been shorter than desirable and we may now be able
s/be been/be/
robert> + to more densely pack the parent by splitting the child nodes more. So if the
robert> + parent node can support a deeper prefix without hitting max_size, and the
robert> + count of under min_size nodes cannot be reduced, the parent should be given
robert> + a deeper prefix.
robert> +1. If it increased above max_size, shrink the prefix width used to split out
robert> + new nodes until the the node is below max_size (unless the prefix
s/the the/the/
robert> + width is already 1 - the minimum).
robert> + To shrink the prefix of an internal node, create new internal nodes for each
robert> + new prefix, and populate them with the content of the nodes which were
robert> + formerly linked. (This will normally bubble down due to keeping densely
robert> + packed nodes).
robert> + To shrink the prefix of a leaf node, create an internal node with the same
robert> + prefix, then choose a width for the internal node such that the contents
robert> + of the leaf all fit into new leaves obeying the min_size and max_size rules.
robert> + The largest prefix possible should be chosen, to obey the
robert> + higher-nodes-are-denser rule. That rule also gives room in leaf nodes for
robert> + growth without affecting the parent node packing.
robert> +1. Update the CHK pointers - serialise every altered node to generate a CHK,
robert> + and update the CHK placeholder in the node's parent; then reserialise the
robert> + parent. CHK pointer propagation can be done lazily when many updates are
robert> + expected.
robert> +
Hmmm. Mixed feelings as a reviewer here. On the one hand, you
provide a lot of details to guide the implementation, which is
good. On the other, I am unable to validate such details without
at least a prototype implementation (I confess I have never
implemented such a scheme, even if most of the concepts ring
bells).
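To make my 'prototype' point concrete, the leaf-splitting step is
the kind of thing I would want to code up first. A sketch under
my own assumptions (character-based prefixes where you talk about
bit widths; leaf_size is a stand-in for the serialised size):

    def bucket_by_prefix(entries, width):
        # entries: {hash_key: value}; group the entries by the
        # first `width` characters of their hash.
        buckets = {}
        for hash_, value in entries.items():
            buckets.setdefault(hash_[:width], {})[hash_] = value
        return buckets

    def split_leaf(entries, prefix, min_size, max_size, leaf_size):
        # Following the 'largest prefix possible' rule: try the
        # deepest split first and back off until both thresholds
        # are respected.
        hash_len = min(len(h) for h in entries)
        for width in range(hash_len, len(prefix), -1):
            buckets = bucket_by_prefix(entries, width)
            sizes = [leaf_size(bucket) for bucket in buckets.values()]
            if min(sizes) >= min_size and max(sizes) <= max_size:
                return width, buckets
        # Imbalanced hashes: no width satisfies both thresholds, so
        # take the smallest prefix, as the text suggests for this
        # corner case.
        return len(prefix) + 1, bucket_by_prefix(entries, len(prefix) + 1)

If that matches your intent, I'll be happy to turn it into real
tests.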
robert> +Multiple versions of nodes for the same PREFIX and internal prefix width should
robert> +compress well for the same tree.
Conclusion
----------
At this point, I'm not able to make an informed judgement, and I
don't know if it's because I lack knowledge about our current
implementation or because I can't predict whether your proposed
implementation will do better (obviously, 'or' is inclusive
here).
I vote BB:tweak again since some points need to be made clearer,
but the whole explanation is better.
Vincent