fast delta generation in brisbane-core - advice/direction needed

Ian Clatworthy ian.clatworthy at internode.on.net
Mon Mar 2 10:08:31 GMT 2009


I'm still getting my head around how the brisbane-core
branch serialises inventories so forgive me if I'm missing
something fundamental here.

The key to making log -v and log DIR/FILES fast is low-cost
delta generation. Basically, I need an efficient way of
implementing something like:

  repo.get_tree_delta_for_revision_id()

Note that the TreeDelta returned doesn't need "unchanged"
populated - just the other stuff (adds, removes, renames, etc.).
It also doesn't need an arbitrary previous revision to compare
against - it's always the LH parent.

I was kind of hoping that we were storing inventory deltas
*directly* and generating inventories from them. If we were,
I'd then be able to hook in at a low layer and do something
like

  repo.chk_bytes.as_tree_delta()

and for log DIR/multiple-files:

  repo.chk_bytes.as_tree_delta(specific_files)

It doesn't seem like that's going to be possible yet?
Whatever delta-ing we're doing seems to be at the storage layer
(as a space(/time?) optimisation) and lost by the time we
extract the text?

Reading through John's emails, we're certainly making great
progress in terms of:

* faster text lookup
* reduced storage size.

But I fear that delta generation will always be slow on large
trees like OOo if the algorithm remains:

1. get inventory
2. get previous inventory
3. calculate changes.
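
To make the cost concrete, here is a minimal sketch of those three
steps in plain Python. It is not bzrlib code: the inventory model (a
dict mapping file_id to a (path, content_hash) pair) and the name
naive_tree_delta are assumptions made up for illustration. The point
is that both loops visit every entry in the tree, so the work is
proportional to tree size even when a revision touches a single file.

```python
# Hypothetical model, NOT the bzrlib API: an inventory is a dict
# mapping file_id -> (path, content_hash).

def naive_tree_delta(old_inv, new_inv):
    """Compare two full inventories; 'unchanged' entries are not reported."""
    delta = {"added": [], "removed": [], "renamed": [], "modified": []}
    # Walk every entry in the new inventory (cost grows with tree size).
    for file_id, (new_path, new_hash) in new_inv.items():
        if file_id not in old_inv:
            delta["added"].append(new_path)
            continue
        old_path, old_hash = old_inv[file_id]
        if old_path != new_path:
            delta["renamed"].append((old_path, new_path))
        if old_hash != new_hash:
            delta["modified"].append(new_path)
    # Walk every entry in the old inventory to find removals.
    for file_id, (old_path, _old_hash) in old_inv.items():
        if file_id not in new_inv:
            delta["removed"].append(old_path)
    return delta


# Steps 1 and 2: the inventory and its left-hand parent (toy data).
parent = {"a-id": ("a.txt", "h1"), "b-id": ("b.txt", "h2"), "c-id": ("c.txt", "h3")}
child = {"a-id": ("a.txt", "h1"), "b-id": ("docs/b.txt", "h2"), "d-id": ("d.txt", "h4")}

# Step 3: calculate changes.
result = naive_tree_delta(parent, child)
```

Here a.txt is visited and compared even though it did not change,
which is exactly the work that a directly stored delta would avoid.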

Thoughts?

Ian C.


