fast delta generation in brisbane-core - advice/direction needed
Ian Clatworthy
ian.clatworthy at internode.on.net
Mon Mar 2 10:08:31 GMT 2009
I'm still getting my head around how the brisbane-core
branch serialises inventories so forgive me if I'm missing
something fundamental here.
The key to making log -v and log DIR/FILES fast is low-cost
delta generation. Basically, I need an efficient way of
implementing something like:
repo.get_tree_delta_for_revision_id()
Note that the TreeDelta returned doesn't need "unchanged"
populated - just the other stuff (adds, removes, renames, etc.).
It also doesn't need an arbitrary previous revision to compare
against - it's always the LH parent.
I was kind of hoping that we were storing inventory deltas
*directly* and generating inventories from them. If we were,
I'd then be able to hook in at a low layer and do something
like
repo.chk_bytes.as_tree_delta()
and for log DIR/multiple-files:
repo.chk_bytes.as_tree_delta(specific_files)
It doesn't seem like that's going to be possible yet?
Whatever delta-ing we're doing seems to be at the storage layer
(as a space(/time?) optimisation) and lost by the time we
extract the text?
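To make the idea concrete, here is a minimal sketch of what "storing deltas directly" could buy us. It is plain Python over toy data, not brisbane-core code: the STORED_DELTAS dict, the (old_path, new_path, file_id) tuple shape, the dict-of-lists return value and the _under helper are all assumptions made for illustration. Only the name as_tree_delta and the specific_files argument come from the calls above.

```python
# Hypothetical store: one delta per revision, relative to its LH parent.
# Each item is (old_path, new_path, file_id); None marks an absent side.
STORED_DELTAS = {
    "rev-2": [
        (None, "doc/index.txt", "id-1"),     # added
        ("README", None, "id-2"),            # removed
        ("src/a.py", "src/b.py", "id-3"),    # renamed
    ],
}


def _under(path, roots):
    """True if path equals one of roots or lives beneath it."""
    return any(path == r or path.startswith(r + "/") for r in roots)


def as_tree_delta(revision_id, specific_files=None, store=STORED_DELTAS):
    """Build a delta summary straight from the stored delta.

    The work is proportional to the number of changed entries,
    not to the size of the tree.
    """
    delta = {"added": [], "removed": [], "renamed": []}
    for old_path, new_path, _file_id in store[revision_id]:
        if specific_files is not None:
            paths = [p for p in (old_path, new_path) if p is not None]
            if not any(_under(p, specific_files) for p in paths):
                continue
        if old_path is None:
            delta["added"].append(new_path)
        elif new_path is None:
            delta["removed"].append(old_path)
        elif old_path != new_path:
            delta["renamed"].append((old_path, new_path))
    return delta
```

With something like this, as_tree_delta("rev-2", ["src"]) would only report the rename under src/, and no full inventory is ever built.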
Reading through John's emails, we're certainly making great
progress in terms of:
* faster text lookup
* reduced storage size.
But I fear that delta generation will always be slow on large
trees like OOo if the algorithm remains:
1. get inventory
2. get previous inventory
3. calculate changes.
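For reference, the three steps above can be sketched roughly as follows. This is a toy model, not the bzrlib implementation: inventories are plain dicts keyed by file id, and the Entry tuple and naive_delta function are hypothetical names. It is only meant to show why the cost grows with tree size rather than with the number of changes.

```python
from collections import namedtuple

# Hypothetical stand-in for an inventory entry; not bzrlib's real class.
Entry = namedtuple("Entry", "path text_sha")


def naive_delta(inv, parent_inv):
    """Compare two full inventories (dicts of file_id -> Entry).

    Every file id in both inventories is visited, so the cost is
    O(tree size) even when only one file changed.
    """
    added = sorted(inv[f].path for f in inv.keys() - parent_inv.keys())
    removed = sorted(
        parent_inv[f].path for f in parent_inv.keys() - inv.keys())
    renamed, modified = [], []
    for file_id in inv.keys() & parent_inv.keys():
        old, new = parent_inv[file_id], inv[file_id]
        if old.path != new.path:
            renamed.append((old.path, new.path))
        if old.text_sha != new.text_sha:
            modified.append(new.path)
    return {
        "added": added,
        "removed": removed,
        "renamed": sorted(renamed),
        "modified": sorted(modified),
    }
```

On a tree the size of OOo, steps 1 and 2 (materialising both dicts) are likely to dominate before the comparison even starts.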
Thoughts?
Ian C.