RFC: iter_changes based commit & inventory

Robert Collins robertc at robertcollins.net
Tue Dec 2 05:41:29 GMT 2008

So, commit is inherently inventory based, because the abstract Tree does
not provide things like 'last_changed' that inventory contains.

However, the iter_changes based commit needs to retrieve the
last_changed field for a single entry (if a single entry is altered);
currently this causes a full inventory to be generated.

Aaron proposed an API a few weeks ago as part of a patch to expose
inventory entries on Tree, which I objected to - and I still do, even
though I now need something similar.

Specifically, I need to get the .revision attribute on all file_ids
returned by iter_changes in order to have the per-file parents.

The options I see are:
 - a method for getting the 'revision' attribute on its own, on Tree.
 - a method to get an entire InventoryEntry, on Tree
 - to use the basis inventory from the repository for this data
 - update iter_changes (with deprecated yada yada) to expose this.
 - make DirstateRevisionTree.inventory be a dynamic object which
   supports partial-evaluation. 

Using the repository basis will require a few small IOs in a
split-inventory situation, whereas the data is already present
and parsed in a dirstate tree. So I think it's less desirable than
getting the data from dirstate. Getting an entire inventory entry is
unneeded here - the revision field is the only(*) field missing from

getting just the revision attribute on its own is not terribly
appealing to me.

Updating iter_changes would be ideal in some respects. WorkingTrees
could use CURRENT_REVISION as their value for this. But it's not as
clearly defined as it might be - do they need to check for a
content/meta change before setting it? Or is it only ever meaningful

Updating iter_changes is probably the most work, due to its performance
sensitivity and optimised C versions.
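
To make the iter_changes option concrete, here is a rough sketch of a
change tuple that also carries the revision pair. Everything in it is an
assumption for illustration: toy_iter_changes is a hypothetical function
over plain dicts, its tuple is much shorter than the real iter_changes
result, and CURRENT_REVISION is a local stand-in for the sentinel a
WorkingTree would report.

```python
# Local stand-in for the working-tree sentinel (assumption for this sketch).
CURRENT_REVISION = "current:"


def toy_iter_changes(basis, working):
    """Yield (file_id, (old_path, new_path), changed_content, (old_rev, new_rev)).

    basis:   file_id -> (path, text, last-changed revision id)
    working: file_id -> (path, text)
    Hypothetical and simplified: only files present on both sides are
    compared, and only content and path changes are detected.
    """
    for file_id in sorted(set(basis) & set(working)):
        old_path, old_text, old_rev = basis[file_id]
        new_path, new_text = working[file_id]
        changed_content = old_text != new_text
        if not changed_content and old_path == new_path:
            continue  # unchanged entries are not reported at all
        # The basis side already knows its revision; the working side can
        # only offer the sentinel.
        yield (file_id, (old_path, new_path), changed_content,
               (old_rev, CURRENT_REVISION))


basis = {"a-id": ("a", "one", "rev-1"), "b-id": ("b", "two", "rev-2")}
working = {"a-id": ("a", "one"), "b-id": ("b", "TWO")}
changes = list(toy_iter_changes(basis, working))
```

In this sketch the question of when to set CURRENT_REVISION is dodged:
only altered entries are yielded, so the sentinel is always used on the
working side.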

Making DirstateRevisionTree.inventory be a dynamic proxy for
late-created InventoryEntry objects is relatively easy, but I'd be
concerned about triggering performance issues as a result of doing that:
there will be overhead on every access to the inventory.
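
A minimal sketch of the dynamic-proxy idea, to show where that overhead
comes from. LazyInventory and ToyEntry are hypothetical names, not
bzrlib classes, and the rows dict stands in for the already-parsed
dirstate data.

```python
class ToyEntry:
    """Toy stand-in for an InventoryEntry (hypothetical)."""

    def __init__(self, file_id, path, revision):
        self.file_id = file_id
        self.path = path
        self.revision = revision


class LazyInventory:
    """Creates entries on first access instead of all up front."""

    def __init__(self, rows):
        # rows: file_id -> (path, last-changed revision id)
        self._rows = rows
        self._cache = {}
        self.entries_built = 0

    def __getitem__(self, file_id):
        # Every access pays for this extra lookup, even once the entry
        # has been created; that is the per-access overhead.
        entry = self._cache.get(file_id)
        if entry is None:
            path, revision = self._rows[file_id]  # KeyError if unknown
            entry = ToyEntry(file_id, path, revision)
            self._cache[file_id] = entry
            self.entries_built += 1
        return entry


inv = LazyInventory({"a-id": ("a", "rev-1"), "b-id": ("b", "rev-2")})
first = inv["b-id"]
second = inv["b-id"]
```

Only the one entry that was asked for is ever built, but code that walks
the whole inventory would now go through __getitem__ for each file.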

I'm inclined to add a specific helper method for just the revision_id,
to be expedient.
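
Roughly what I have in mind, as a sketch only. ToyDirstateTree and
get_file_revision are hypothetical names, not existing API, and the rows
dict stands in for data the dirstate has already parsed.

```python
class ToyDirstateTree:
    """Toy stand-in for a dirstate-backed revision tree (hypothetical)."""

    def __init__(self, rows):
        # rows: file_id -> (path, kind, last-changed revision id)
        self._rows = dict(rows)
        self.inventories_built = 0

    def build_inventory(self):
        # The expensive path: materialise an entry for every file.
        self.inventories_built += 1
        return {
            file_id: {"path": path, "kind": kind, "revision": revision}
            for file_id, (path, kind, revision) in self._rows.items()
        }

    def get_file_revision(self, file_id):
        # Hypothetical helper: answer from the single row, without
        # building the inventory.
        return self._rows[file_id][2]


def per_file_parents(basis_tree, changed_file_ids):
    """Map each changed file_id to its last-changed revision in the basis."""
    return {
        file_id: basis_tree.get_file_revision(file_id)
        for file_id in changed_file_ids
    }


tree = ToyDirstateTree({
    "a-id": ("a", "file", "rev-1"),
    "b-id": ("b", "file", "rev-2"),
})
parents = per_file_parents(tree, ["b-id"])
```

The commit code would call the helper once per file_id reported by
iter_changes, so a single-file change costs a single lookup.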


