[RFC] Should we rewrite nested-trees or our formats or punt?

John Arbash Meinel john at arbash-meinel.com
Wed Mar 25 21:58:25 GMT 2009


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Aaron Bentley wrote:
> Hi all,
> 

...

> - in particular, code paths use Tree.iter_references to determine which
> subtrees need to be examined, which must walk the tree.
> 
> This means that if we land brisbane-core today, it will be slower than
> it should be.
> 
> 
> Option 1:
> Store the data harmoniously with our code paths.
> 
> We can store the data about nested trees in an extra CHK tree in
> brisbane-core, and we can add special fields to dirstate.  The extra
> data will typically be very small.
> 
> This feels too late in the release process.
> 

...

> Option 3:
> Lie about subtree support
> 
> We can land a brisbane-core format that supports subtrees but claims not
> to.  Or alternatively, we can have a config option to enable subtree
> support (i.e. in locations.conf).
> 


I feel like we should do a mix of 3 & 1. We can make sure the serializer
format is able to write "tree-reference" but simply refuses to do so for
now. Then we plan on upgrading the disk formats to include the extra
index for tree-references. (Note that for dirstate, we still don't do a
partial walk, so we could just keep the index as memory-only, and have
it generated during _parse_blocks_if_needed().)
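
To make the idea concrete, here is a minimal sketch, not actual bzrlib
code. All of the names (KNOWN_KINDS, SubtreesDisabled, serialize_entry,
build_reference_index) and the one-line text layout are assumptions made
up for illustration. It shows a serializer that knows the
"tree-reference" kind but refuses to emit it unless enabled, and an
index that exists only in memory and is built while doing the full parse
we already do.

```python
# Hypothetical names throughout; this is not bzrlib's serializer API.
KNOWN_KINDS = {"file", "directory", "symlink", "tree-reference"}


class SubtreesDisabled(Exception):
    """Raised when asked to write a kind the format does not enable yet."""


def serialize_entry(kind, file_id, name, supports_subtrees=False):
    """Render one inventory entry as a line of text (made-up layout)."""
    if kind not in KNOWN_KINDS:
        raise ValueError("unknown kind: %r" % (kind,))
    if kind == "tree-reference" and not supports_subtrees:
        # The format can represent the kind, but refuses to write it for now.
        raise SubtreesDisabled(file_id)
    return "%s: %s %s" % (kind, file_id, name)


def build_reference_index(entries):
    """Collect tree references into a memory-only index during a full parse.

    `entries` is an iterable of (kind, file_id, name) tuples.
    """
    return {
        file_id: name
        for kind, file_id, name in entries
        if kind == "tree-reference"
    }
```

Because the kind is already part of the format, turning it on later
would only mean flipping the flag (and adding the on-disk index), not
teaching the serializer a new entry type.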

That way the conversion process is mostly trivial. It might need to walk
the .iix and change the top-level reference pages, but that is changing
<500 bytes of data for each revision. And since we know we weren't using
tree references anyway, we don't have to walk *any* of the chk pages. So
whatever we need to put in for an empty chk_map reference, we know that
we just write the same value to every node.
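
A rough sketch of that conversion, under stated assumptions: the root
records are modelled as plain dicts keyed by revision id, and the
function name, the "tree_reference_root" field, and the
empty_reference_key parameter are all hypothetical. The real value for
an empty chk_map reference is not decided here, so it is passed in
rather than guessed.

```python
def upgrade_root_records(roots, empty_reference_key):
    """Return new root records with the same empty-map key added to each.

    `roots` maps revision_id -> dict of existing root keys (hypothetical
    layout). Only the root records are touched; no CHK page is read.
    """
    upgraded = {}
    for revision_id, record in roots.items():
        new_record = dict(record)  # leave the input untouched
        # Every revision gets the identical value, since no tree
        # references were ever written in the old format.
        new_record["tree_reference_root"] = empty_reference_key
        upgraded[revision_id] = new_record
    return upgraded
```

The loop does a constant amount of work per revision, which is why the
cost scales with the number of revisions rather than with the size of
the trees.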

I would expect that upgrade to take something like <1 min for 100k
revisions. The main reason we want most things in brisbane-core is
because converting from XML => CHK costs hours. (Converting the CHK page
layout would be expensive, but if the page-layout is fixed, just
changing the root nodes is cheap.)
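
As a back-of-the-envelope check on the numbers above, taking 500 bytes
per revision as an upper bound and 100k revisions (the variable names
are just for illustration):

```python
revisions = 100_000          # "100k revisions"
bytes_per_revision = 500     # upper bound: "<500 bytes" per revision

total_bytes = revisions * bytes_per_revision
total_mb = total_bytes / 1e6  # decimal megabytes

print("at most %.0f MB of root data rewritten" % total_mb)
```

So the upgrade rewrites at most about 50 MB in total, regardless of how
large the CHK pages themselves are.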

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAknKqQEACgkQJdeBCYSNAAMnWwCePfOjEa9u/GrGTaI8g2Hkzdeb
uegAoIoqfcY/yKc5Q3av+6vZW41qU8Vl
=rA4b
-----END PGP SIGNATURE-----


