[MERGE] Review feedback on hash trie inventories, and describe radix tree inventories, plus some details on hash trie implementation.

John Arbash Meinel john at arbash-meinel.com
Tue Sep 2 16:54:40 BST 2008



Robert Collins wrote:
> Meta-note, I have had no bb:COMMAND message for this patch, so it's
> outstanding.

I'm pretty sure Vila gave you BB:tweak. Probably the reason it wasn't picked
up is that it was given further down in the mail thread, and BB only respects
a direct response to the original submission email.

I'm perfectly happy with you bringing this in, as it is good documentation to
have around.  So:

BB:approve

Or tweak, depending on whether vila was requesting fixes for typos, etc.

That said, I still think we need to discuss the ideas presented, and we may
need to refine the document as a result. But that doesn't mean we can't merge
what you've written.

> 
> On Thu, 2008-08-28 at 11:29 -0500, John Arbash Meinel wrote:
> 
>> One thing which concerns me with Robert's scheme, is just how much
>> data gets
>> written to disk on a commit. If you change an entry in the second
>> level, you
>> obviously have to write out that page, and then the root page.
>> So if we use 4k pages, that means every commit is 8k.
> 
> Right. However, only one line in each of those pages is changed. So it's
> 2 lines of unique text, and 7.9K of duplicate text.
> 

Sure. I'm going back to your original statement (paraphrased): "if we need to
do delta compression, it needs to be evaluated in terms of the whole stack,
because it will affect how everything works together".

I agree that you can do a hash trie on top of delta compression and get
something that is space efficient. But how is that going to affect extraction
times, the number of round-trip requests, etc.? We now have one layer that
needs a round trip for every effective hash page, and then we are adding
another layer that needs something to extract a full page.
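
Concretely, the extra layer I'm worried about looks something like this
(the helper names are invented, just a sketch of the layering, not the
real bzrlib API):

  # Purely illustrative: get_record(), delta_basis, apply_to() etc. are
  # invented names.
  def expand_page(repo, chk):
      """Turn a page CHK into full page text, walking its delta chain."""
      chain = []
      while chk is not None:
          record = repo.get_record(chk)   # one fetch / index lookup each
          chain.append(record)
          chk = record.delta_basis        # None once we reach a full text
      text = chain.pop().fulltext         # the full text at the bottom
      for record in reversed(chain):      # re-apply the deltas, oldest first
          text = record.apply_to(text)
      return text

So every page the trie walk touches potentially costs a whole chain of
fetches at this layer, on top of the one round trip per trie level at the
layer above.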


I think we should also be asking whether we could do a hash trie *without*
delta compression, say by shrinking the page size, or by using some other
representation (something akin to your idea of journaling).

If we are compressing at the level of hash pages, do we completely lose the
ability to combine things between the pages? Right now we write O(N/200 +
changes), because 199/200 of the time (obviously the real average is less) we
get to write a delta, which is a single record.
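
To put rough numbers on that (all made up, not measured; I'm reading the
N/200 as "a full text every ~200 deltas"):

  # Back-of-the-envelope only; every number here is invented.
  N = 20000              # inventory entries in the tree
  SNAPSHOT_EVERY = 200   # current scheme: full text every ~200 deltas
  changes = 1            # entries touched by this commit

  # Current scheme: one record per commit; amortized size is the
  # occasional full text plus the changed lines.
  current_records = 1
  current_lines = N / SNAPSHOT_EVERY + changes       # ~101 lines amortized

  # Hash trie, full 4k pages: two records (leaf + root), ~8k written no
  # matter how small the change is.
  trie_records = 2
  trie_bytes = 2 * 4096

  # Hash trie with per-page deltas: still two records per commit, each
  # only a couple of lines -- but now every page has its own delta chain.
  trie_delta_records = 2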

If I understand your system, with 1 root update and 1 sub-page update, you end
up with 2 deltas you need to look up, rather than just one. And because of the
hash trie layering, you don't know about the second one until you get there,
so the lookups can't be parallelized.
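
In other words, something like (again invented names; expand_page() stands
in for however a page actually gets fetched and, if delta compressed,
expanded):

  def lookup(repo, root_chk, hash_key):
      root_page = expand_page(repo, root_chk)       # round trip #1
      child_chk = root_page.child_for(hash_key)
      # child_chk is unknown until the root page is in hand, so round
      # trip #2 cannot be issued in parallel with #1.
      leaf_page = expand_page(repo, child_chk)      # round trip #2
      return leaf_page.entry_for(hash_key)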

...

>> How about an example lookup:
>>
>> Find the InventoryEntry for file_id = "j-file-id" in revision
>> "a-rev-id"
>>
>> 1) compute hash_file_id = hash('j-file-id')
>> 2) Read the "root node" for "a-rev-id", to get the CHK for the
>> content_map
>> 3) Read the root page of the content map
>> 4) compare hash_file_id to root_page.hash_prefix (which for the root
>> is
>> probably empty?)
> 
> Yes, it is - that's explicit in the docs, I believe.

It may be; I mostly felt like walking through the steps in my own mind.
Further, it isn't 100% true, because if we are adding page deltas then there
are more steps involved. So it is true *at this layer*, but I think you were
right that we need to be discussing the stack, and not just one layer.
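
As pseudo-code, the walk I have in mind is roughly this (helper names are
invented, not the real API, and each expand_page() hides whatever page-delta
expansion is needed):

  from hashlib import sha1

  def find_entry(repo, revision_id, file_id):
      hash_file_id = sha1(file_id).hexdigest()              # step 1

      root_node = repo.get_revision_root(revision_id)       # step 2
      page = expand_page(repo, root_node.content_map_chk)   # step 3

      # step 4 and onwards: the root's hash_prefix is empty so anything
      # matches; each deeper page narrows the prefix until we hit a leaf.
      while not page.is_leaf:
          assert hash_file_id.startswith(page.hash_prefix)
          page = expand_page(repo, page.child_for(hash_file_id))
      return page.entry_for(file_id)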

For example, we *might* consider not doing page deltas for root nodes, as we
know they get accessed a lot, and we don't want to pay N accesses (or index
lookups, etc.) to expand them.

...

> I think you could do hash(path) or path, but path will give 'ls' whereas
> hash(path) won't. Or we need the storage of a directory entry to
> explicitly include its children. I'm leaning to path keys as the best
> answer.

That is my leaning as well, though I will mention that the trie in the paper
you gave was not path compressed, so intermediate nodes only consumed a single
byte, which is probably bad for directory listings. I'll also note that the
infinite hash-page approach performed worse on *long* random-length strings
than some of the other approaches. It was mostly optimized for the case where
the prefix of the string is 99% consumed by the trie, such that you had <
size(pointer) left to put in the buckets.
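
By "not path compressed" I mean each intermediate node consumes exactly one
key byte; a path-compressed (radix) node instead carries a whole shared prefix
on its edge. Something like this sketch, which is purely illustrative and not
the format being proposed:

  class RadixNode(object):
      """Illustrative path-compressed trie node.

      Without path compression, looking up 'doc/foo.txt' walks one node
      per byte (11 nodes); with it, each edge carries the longest prefix
      shared by everything below, so the lookup only touches the points
      where keys actually diverge.
      """

      def __init__(self, edge=''):
          self.edge = edge      # multi-byte edge label ('' at the root)
          self.children = {}    # first byte of child edge -> RadixNode
          self.value = None     # e.g. an InventoryEntry at terminal nodes

      def lookup(self, key):
          if not key:
              return self.value
          child = self.children.get(key[0])
          if child is None or not key.startswith(child.edge):
              return None
          return child.lookup(key[len(child.edge):])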

John
=:->


