new style inventories continued - now is a great point to jump in and hack

Robert Collins robertc at robertcollins.net
Thu Oct 9 06:25:03 BST 2008


So I've been continuing to explore the world of inventories that are in
multiple pieces.

I've made a major step forward, and committed and pushed it (to
http://people.ubuntu.com/~robertc/baz2.0/repository). This step has a
functional (passes all tests) repository format based on a persistent
dictionary stored in the repository.

This is a major step because it meant finding and fixing (hopefully) all
the code that accessed an inventory from a repository after unlocking
the repository.

What it is
----------

Proof of concept.
Basis for future work.
Something that can be experimented with.

What it isn't
-------------

Ready for merging.
Fast.


Details
-------

I made fetch deserialise and serialise inventories when fetching into a
repository with split inventories. This works but has lots of room for
improvement.

Similarly the find_text_key_reference and related functions use the full
inventory object. Again, this works but is massively tunable.

Each inventory is now a small object, with just the revision id, root id
and the CHK of the file_id_to_inventory_entry CHKMap. (See
chk_serializer and inventory's CHKInventory for details).
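To make the shape of that small object concrete, here is a minimal sketch in modern Python. It is not the code in chk_serializer or CHKInventory: the names ChkStore, save_inventory and load_inventory are hypothetical, JSON stands in for the real serialisation, and the whole map is stored as one blob, whereas the point of the real map is that it is split into pieces. It only illustrates that the inventory record holds the revision id, the root id and a content hash key that points at the file id to entry map.

```python
import hashlib
import json


class ChkStore:
    """Toy content-addressed store: each blob is filed under its SHA-1."""

    def __init__(self):
        self._blobs = {}

    def add(self, data):
        key = "sha1:" + hashlib.sha1(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key):
        return self._blobs[key]


def save_inventory(store, revision_id, root_id, entries):
    """Store the file_id -> entry map, then a small record pointing at it."""
    map_bytes = json.dumps(entries, sort_keys=True).encode("utf-8")
    map_key = store.add(map_bytes)
    record = {
        "revision_id": revision_id,
        "root_id": root_id,
        "id_to_entry": map_key,  # the CHK of the map, not the map itself
    }
    return store.add(json.dumps(record, sort_keys=True).encode("utf-8"))


def load_inventory(store, inventory_key):
    """Return (record, entries) for a previously saved inventory."""
    record = json.loads(store.get(inventory_key).decode("utf-8"))
    entries = json.loads(store.get(record["id_to_entry"]).decode("utf-8"))
    return record, entries


store = ChkStore()
entries = {"root-id": {"name": "", "kind": "directory"},
           "file-id": {"name": "README", "kind": "file"}}
key1 = save_inventory(store, "rev-1", "root-id", entries)
key2 = save_inventory(store, "rev-2", "root-id", entries)
record1, loaded1 = load_inventory(store, key1)
record2, loaded2 = load_inventory(store, key2)
```

Because the map is addressed by its content, the two revisions above get different inventory records but share a single stored copy of the unchanged map.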


What you can do
---------------

One thing would be to just give it a spin - finding issues is good. The
format to play with is either --development3, or --development3-subtree.
I'm not treating these as stable until they actually merge to bzr.dev,
so don't migrate your own work areas to them yet :).

Another possibility is to profile an operation you'd like to be fast
with these inventories, and then work on that code path to improve it:
adding caches, extending the fragmentation to be closer to the ideals
put forward in the doc/developers/inventory.txt file, implementing
concurrent traversal of maps, etc.
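As a rough illustration of the profiling step, the sketch below wraps a single call in the standard library's cProfile and reports the most expensive functions by cumulative time. The helper profile_call and the stand-in slow_inventory_walk are hypothetical; in practice you would pass in whatever operation on the new format you care about.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the ten entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


def slow_inventory_walk(n):
    """Hypothetical stand-in for an operation worth speeding up."""
    return sum(len(str(i)) for i in range(n))


result, report = profile_call(slow_inventory_walk, 1000)
print(report)
```

The report points at the code path to work on; re-running it after a change shows whether the change helped.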

Just reading the code and commenting on it would be useful too. While
I will endeavour to pluck useful bits out for early merging, there are a
lot of unknowns about this code, and it's hard to say to a reviewer that
it should be one way or another until the testing and profiling process
is completed.

A full task list for this branch is roughly:

before merging
--------------

Test the parameters I laid out in my prior email about inventories, to
determine what to pick.

Ensure all the new code is thoroughly tested and works as a default
format. (While I practise TDD, it doesn't always fit well).

Teach check to check the components of the inventory.


after merging
-------------

Start updating inventories during commit rather than serialising afresh.

Get historical diffs to start working via two-step delta combination.
(combine(delta(basis, working), delta(historical1, historicalbasis))).
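The idea of combining two deltas can be sketched with a deliberately simplified model. Here an inventory is a plain dict of file id to entry, a delta maps each changed file id to its new entry (or None for a removal), and delta(old, new) means "the changes that turn old into new". All of that, including the argument order and the function names, is an assumption made for illustration and not the representation used in the branch.

```python
def delta(old, new):
    """Changes turning inventory `old` into `new`.

    Maps file_id -> new entry, or None when the file was removed.
    """
    changes = {}
    for file_id in old.keys() - new.keys():
        changes[file_id] = None
    for file_id, entry in new.items():
        if old.get(file_id) != entry:
            changes[file_id] = entry
    return changes


def apply_delta(inventory, changes):
    """Return a new inventory with `changes` applied."""
    result = dict(inventory)
    for file_id, entry in changes.items():
        if entry is None:
            result.pop(file_id, None)
        else:
            result[file_id] = entry
    return result


def combine(first, second):
    """One delta equivalent to applying `first` and then `second`."""
    combined = dict(first)
    combined.update(second)  # later changes win for the same file id
    return combined


historical = {"a": "a.txt@1", "b": "b.txt@1"}
basis = {"a": "a.txt@2", "c": "c.txt@1"}
working = {"a": "a.txt@2", "c": "c.txt@2", "d": "d.txt@1"}

two_step = combine(delta(historical, basis), delta(basis, working))
rebuilt = apply_delta(historical, two_step)
```

In this model the combined delta takes the historical inventory straight to the working one without materialising the basis inventory in between.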


For this branch, please send patches as merge requests made against the
branch (not bzr.dev :)) to the list, CC'ing me, with a header that won't
match BB. I'll treat requests for merges into this branch as top
priority to review and apply.

Cheers,
Rob



-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.