[CHK/MERGE] CHKMap.iter_interesting_nodes() and fetch updates

Robert Collins robertc at robertcollins.net
Mon Dec 1 22:57:21 GMT 2008


On Mon, 2008-12-01 at 12:28 -0600, John Arbash Meinel wrote:
> I don't think it was blocked on you specifically. It was blocked on "one
> other core dev had reviewed it". I think Martin and Andrew were both
> rather busy with stuff for the 1.10 release. Thanks for looking at it
> and merging it.
> 
> By the way, are you working on this at all? I'd like to make sure we
> aren't bumping into each other on this.

Yes. I'm currently working on commit, to give it a safe, single-path
core based on iter_changes.

> My plan is to work on getting map() and unmap() to be stable, and then
> to work on making it possible to use a hash. With a wide-fan-out hash we
> can easily do something like 1M keys in a depth 3 tree (root + 1
> internal + leaves). And I think a lot of operations are going to be more
> critical by depth.

I think it's important we run a benchmark with a hashed key. I'm not
convinced that it will be better or worse (though there is experimental
evidence that it's better - which is another reason to test it). For
length 2 key-tuples, I suggest that the mapping be (hash(key[0]),
hash(key[1])), to give locality of reference.
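
A rough sketch of that mapping, to make the locality point concrete.
The function name hashed_search_key, the use of zlib.crc32 as the hash,
the hex formatting and the NUL separator are all assumptions for
illustration, not existing CHKMap code:

```python
import zlib


def hashed_search_key(key):
    """Map a key tuple to a string by hashing each element on its own.

    Hypothetical helper: crc32 is only a stand-in for whatever hash
    ends up being benchmarked.
    """
    parts = []
    for element in key:
        crc = zlib.crc32(element.encode("utf-8")) & 0xFFFFFFFF
        # Fixed-width hex so every element contributes 8 characters.
        parts.append("%08X" % crc)
    # NUL-separated, one hashed field per element of the tuple.
    return "\x00".join(parts)


# Two entries in the same directory share the hashed key[0] prefix,
# so they sort next to each other and land in the same subtree.
a = hashed_search_key(("dir-id", "foo.py"))
b = hashed_search_key(("dir-id", "bar.py"))
same_prefix = a[:8] == b[:8]
```

Hashing the whole tuple in one go would spread those two entries
across unrelated parts of the tree; hashing per element keeps them
adjacent while still giving fan-out on key[0].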

> I was also thinking that for the (parent_id,basename) => file_id map we
> could actually hash the parent_id to be the key. That would also give us
> good fan out and still retain our locality per basename.

Ah yes, exactly what I just said above :).

> Oh, and another very helpful thing there would be to pull out the common
> prefix in leaf (and internal) nodes. (path_id, basename) has a lot of
> redundancy. I think I calculated you could fit 2x the number of nodes
> per leaf if you just pulled out the common prefix.

Yup, definitely worth doing. One thing to remember is that nodes are
always defined bottom-up, so when you pull out such things, be sure not
to make them start depending on a top-down provision of data.
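
Something like the following is what I have in mind, as a sketch only.
The names compress_keys and expand_keys and the flat string keys are
made up for illustration; the point is that the prefix is computed from
the node's own keys and stored in the node, so nothing has to be handed
down from a parent:

```python
import os.path


def compress_keys(keys):
    """Split a node's keys into (common_prefix, suffixes).

    Hypothetical helper: the prefix is derived purely from the keys
    held in this node (bottom-up), never supplied by a parent.
    """
    # commonprefix works character by character on a list of strings.
    prefix = os.path.commonprefix(keys)
    suffixes = [key[len(prefix):] for key in keys]
    return prefix, suffixes


def expand_keys(prefix, suffixes):
    """Rebuild the full keys using only data stored in the node."""
    return [prefix + suffix for suffix in suffixes]


keys = ["dir-id\x00bar.py", "dir-id\x00baz.py", "dir-id\x00foo.py"]
prefix, suffixes = compress_keys(keys)
restored = expand_keys(prefix, suffixes)
```

A node serialised this way can still be read and hashed in isolation,
which is what keeps the bottom-up definition intact.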

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.