[RFC] Should we rewrite nested-trees or our formats or punt?

Michael Hudson michael.hudson at canonical.com
Wed Mar 25 22:22:53 GMT 2009


John Arbash Meinel wrote:
> Aaron Bentley wrote:
>> Hi all,
> 
> 
> ...
> 
>> - in particular, code paths use Tree.iter_references to determine which
>> subtrees need to be examined, which must walk the tree.
> 
>> This means that if we land brisbane-core today, it will be slower than
>> it should be.
> 
> 
>> Option 1:
>> Store the data harmoniously with our code paths.
> 
>> We can store the data about nested trees in an extra CHK tree in
>> brisbane-core, and we can add special fields to dirstate.  The extra
>> data will typically be very small.
> 
>> This feels too late in the release process.
> 
> 
> ...
> 
>> Option 3:
>> Lie about subtree support
> 
>> We can land a brisbane-core format that supports subtrees but claims not
>> to.  Or alternatively, we can have a config option to enable subtree
>> support (i.e. in locations.conf).
> 
> 
> 
> I feel like we should do a mix of 3 & 1. We can make sure the serializer
> format has the ability to write "tree-reference" and just refuses to do
> so for now. And then plan on upgrading the disk formats to include the
> extra index for tree-references. (Note that for dirstate, we still don't
> do a partial walk, so we could just keep the index as memory-only, and
> have it generated during _parse_blocks_if_needed().)
> 
> That way the conversion process is mostly trivial. It might need to walk
> the .iix and change the top-level reference pages, but that is changing
> <500 bytes of data for each revision. And since we know we weren't using
> tree references anyway, we don't have to walk *any* of the chk pages. So
> whatever we need to put in for an empty chk_map reference, we know that
> we just write the same value to every node.

I think I agree with John, with the caveat that I don't 100% understand
the details: it would be a crying shame if the solution to this
involved reserializing everything when nested trees go gold.
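
For what it's worth, here is how I read John's suggestion, as a rough
Python sketch rather than anything resembling the real bzrlib code.
Every name in it (SketchSerializer, upgrade_root_records,
EMPTY_REFERENCE_MAP_KEY, the record layout) is made up for
illustration, and the empty-map key is a placeholder, not what an empty
chk_map actually hashes to.

```python
import hashlib

# Placeholder for "whatever we need to put in for an empty chk_map
# reference". The real value would come from the CHK code; this one is
# only an assumption so the sketch can run.
EMPTY_REFERENCE_MAP_KEY = "sha1:" + hashlib.sha1(b"").hexdigest()


class TreeReferenceNotEnabled(Exception):
    """A tree-reference entry reached a serializer that refuses them."""


class SketchSerializer:
    """Toy serializer that knows about 'tree-reference' but refuses it."""

    def __init__(self, allow_tree_references=False):
        self.allow_tree_references = allow_tree_references

    def entry_to_line(self, kind, file_id, name, reference_revision=None):
        fields = [kind, file_id, name]
        if kind == "tree-reference":
            # The format can express this kind; we just decline for now.
            if not self.allow_tree_references:
                raise TreeReferenceNotEnabled(name)
            fields.append(reference_revision)
        return "\t".join(fields)


def upgrade_root_records(root_records):
    """Add the same empty-map key to every revision's root record.

    root_records maps revision id -> dict of root keys. Only these small
    top-level records are rewritten; no page below them is read.
    """
    return {
        revision_id: dict(record, reference_map_root=EMPTY_REFERENCE_MAP_KEY)
        for revision_id, record in root_records.items()
    }
```

The point of the second function is that the upgrade touches one small
record per revision and writes a constant into each, which only works
because no tree references could have been written in the meantime.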

Cheers,
mwh


