[MERGE] Add an InventoryEntry cache for xml deserialization
Robert Collins
robertc at robertcollins.net
Thu Dec 11 19:40:54 GMT 2008
Robert Collins has voted approve.
Status is now: Approved
Comment:
> 1) First an assumption. For a given (file_id, revision_id) pair, the
> InventoryEntry must be the same.
This is only guaranteed true for the same model repo - E.g. tree roots
are different between plain and rich-root. Also there was a very short
lived bug back around 0.0.8 that misrecorded the exec bit for about 2
commits; but I think this is ignorable.
split-inv might benefit from a cache too; and for split-inv the source
string is a safe cache key (it is safe in xml inventories too).
> 5) A FIFO cache is a lot better for this reason, but we have to take
> more care when we actually fill the cache. This is because 90% of what
> is in the cache we want to re-use, the records come to us in "sorted"
> order. So once we start pushing off a couple of the records, we tend
to
> push out all of the interesting ones, and restart the cache.
This is usual for FIFO caches; though your fifo cache looked more like a
LRU cache implemented with a FIFO queue to me (because each use popped
and readded the item (unless I misread it)). A dual queue (e.g.
frequency/lru) can perform a lot better in terms of hit rate, though at
a computation cost.
> 8) One other really big win that I didn't do is to get rid of calling
> InventoryEntry.copy() for files and symlinks. This is safe to do if
the
> Inventories are treated as readonly, or if "apply_delta" would create
> new File records, but I'm not sure if we can trust Inventory users to
> not mess with the entries in our cache.
We can't :)
> It *is* a fairly large win (25s => 20s, or about 20% of remaining
time).
> So I would consider adding a flag somewhere to tell the deserializer
> that "I know I have to treat these as readonly".
I'd introduce immutable versions of the objects; CHKInventory wants this
too.
For details, see:
http://bundlebuggy.aaronbentley.com/project/bzr/request/%3C49404BFA.7070705%40arbash-meinel.com%3E
Project: Bazaar
More information about the bazaar
mailing list