How to handle extracting lots of Inventories

Wed Oct 8 20:55:52 BST 2008

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John Arbash Meinel wrote:
> I've been looking at doing "bzr reconcile" and trying to figure out why
> it is so slow. I won't say I've profiled it everywhere, but at least for
> bzr.dev, when trying to "_generate_text_index" the time is primarily
> spent in xml8.unpack_inventory.
> 

To give specific numbers: running "_generate_text_index" takes 715s on
my repository. Of that time, 659s (about 92%) is spent in
'repo.revision_trees()'. I've also tweaked the caching parameters so
that out of 47k inventory lookups, only 346 needed to be extracted a
second time. (Of course, it then uses 300+MB of memory, just for a
repository that is mostly bzr.)
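
For anyone who wants to reproduce that kind of count, here is a minimal
sketch of an LRU cache that records how often a key has to be extracted
a second time. It is not bzrlib code: CountingLRUCache, fake_extract and
the revision ids are all made-up names for illustration, and it assumes
Python 3 for OrderedDict.move_to_end.

```python
from collections import OrderedDict


class CountingLRUCache:
    """LRU cache that counts lookups and repeat extractions (hypothetical)."""

    def __init__(self, max_size, extract):
        self._max_size = max_size
        self._extract = extract      # callable: key -> expensive value
        self._cache = OrderedDict()  # least recently used entry comes first
        self._seen = set()           # keys extracted at least once
        self.lookups = 0
        self.re_extractions = 0

    def get(self, key):
        self.lookups += 1
        if key in self._cache:
            # Cache hit: mark the key as most recently used.
            self._cache.move_to_end(key)
            return self._cache[key]
        if key in self._seen:
            # We had this value once, evicted it, and now pay for it again.
            self.re_extractions += 1
        self._seen.add(key)
        value = self._extract(key)
        self._cache[key] = value
        if len(self._cache) > self._max_size:
            # Evict the least recently used entry.
            self._cache.popitem(last=False)
        return value


extracted = []


def fake_extract(revision_id):
    # Stand-in for the expensive inventory extraction step.
    extracted.append(revision_id)
    return "inventory for " + revision_id


cache = CountingLRUCache(max_size=2, extract=fake_extract)
for revision_id in ["rev-a", "rev-b", "rev-a", "rev-c", "rev-b"]:
    cache.get(revision_id)
```

With a larger max_size the repeat extractions go down and the memory use
goes up, which is the trade-off described above.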

So one possibility would be a way to mutate an in-memory Inventory
based on a delta (extracted from a serialized delta) without having to
rebuild the whole thing from scratch, for example with some sort of
copy-on-write sharing.
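
A rough sketch of the sharing idea, under some loud assumptions: this is
a toy model, not bzrlib's Inventory, and the delta format used here (a
list of (file_id, new_entry) pairs, with None meaning "removed") is
invented for the example. Applying a delta copies only the top-level
dict; every unchanged entry object is shared with the parent inventory
rather than rebuilt.

```python
from collections import namedtuple

# Hypothetical stand-in for an inventory entry.
Entry = namedtuple("Entry", "file_id name revision")


class SharedInventory:
    """Toy inventory mapping file_id -> Entry (not the bzrlib class)."""

    def __init__(self, entries=None):
        self._entries = dict(entries or {})

    def apply_delta(self, delta):
        """Return a new inventory with `delta` applied.

        `delta` is a list of (file_id, new_entry) pairs; a new_entry of
        None removes the file. The original inventory is left untouched.
        """
        # Shallow copy: a new dict, but the entry objects are shared.
        new_entries = dict(self._entries)
        for file_id, new_entry in delta:
            if new_entry is None:
                new_entries.pop(file_id, None)
            else:
                new_entries[file_id] = new_entry
        return SharedInventory(new_entries)

    def __getitem__(self, file_id):
        return self._entries[file_id]

    def __contains__(self, file_id):
        return file_id in self._entries

    def __len__(self):
        return len(self._entries)


base = SharedInventory({
    "a-id": Entry("a-id", "a.txt", "rev-1"),
    "b-id": Entry("b-id", "b.txt", "rev-1"),
})
child = base.apply_delta([
    ("a-id", Entry("a-id", "a.txt", "rev-2")),  # modified
    ("c-id", Entry("c-id", "c.txt", "rev-2")),  # added
])
```

The dict copy is still proportional to the size of the tree, so this
only avoids the parsing and object creation, not all of the work. A
layered or paged structure would be needed to avoid the copy as well.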

Also, the "page-based" inventory that Robert is working on might get
this implicitly, if he does end up caching the InventoryEntry objects
so that they are shared between Inventory instances.
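
To illustrate what sharing entries between inventories could look like
(this says nothing about how the page-based design actually works), here
is a small sketch of an entry cache. EntryCache, build_inventory and the
row format are hypothetical, and it assumes that a (file_id, revision)
pair is enough to identify an entry.

```python
from collections import namedtuple

# Hypothetical stand-in for an inventory entry.
Entry = namedtuple("Entry", "file_id revision name")


class EntryCache:
    """Hands out one shared Entry object per (file_id, revision) key."""

    def __init__(self):
        self._entries = {}
        self.hits = 0

    def get_entry(self, file_id, revision, name):
        key = (file_id, revision)
        entry = self._entries.get(key)
        if entry is None:
            # First time we see this key: build and remember the entry.
            entry = self._entries[key] = Entry(file_id, revision, name)
        else:
            self.hits += 1
        return entry


def build_inventory(entry_cache, rows):
    """Build a file_id -> Entry dict from (file_id, revision, name) rows."""
    return {
        file_id: entry_cache.get_entry(file_id, revision, name)
        for file_id, revision, name in rows
    }


entry_cache = EntryCache()
inv_1 = build_inventory(entry_cache, [
    ("a-id", "rev-1", "a.txt"),
    ("b-id", "rev-1", "b.txt"),
])
inv_2 = build_inventory(entry_cache, [
    ("a-id", "rev-2", "a.txt"),  # changed in rev-2, so a new entry
    ("b-id", "rev-1", "b.txt"),  # unchanged, so shared with inv_1
])
```

Since most entries are unchanged from one revision to the next, most of
the lookups for neighbouring inventories would be cache hits.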

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkjtEEgACgkQJdeBCYSNAAMNZACfb4itUrI0LDOaXoYeJQqQfzAc
fJgAoMF/Zn2N0ihbbNL2kfEN6IngutFC
=UqF9
-----END PGP SIGNATURE-----