Rev 4505: (robertc) Document how, in file:///home/pqm/archives/thelove/bzr/%2Btrunk/ Patch Queue Manager pqm at
Fri Jul 3 00:10:55 BST 2009

At file:///home/pqm/archives/thelove/bzr/%2Btrunk/

revno: 4505 [merge]
revision-id: pqm at
parent: pqm at
parent: robertc at
committer: Patch Queue Manager <pqm at>
branch nick: +trunk
timestamp: Fri 2009-07-03 00:10:53 +0100
  (robertc) Document how,
  	why and issues with inventory deltas. (Robert Collins)
  doc/developers/inventory.txt   inventory.txt-20080103013957-opkrhxy6lmywmx4i-1
=== modified file 'bzrlib/'
--- a/bzrlib/	2009-07-01 10:50:24 +0000
+++ b/bzrlib/	2009-07-02 23:10:53 +0000
@@ -1088,6 +1088,9 @@
     def apply_delta(self, delta):
         """Apply a delta to this inventory.
+        See the inventory developers documentation for the theory behind
+        inventory deltas.
         :param delta: A list of changes to apply. After all the changes are
             applied the final inventory must be internally consistent, but it
             is ok to supply changes which, if only half-applied would have an
@@ -1576,6 +1579,9 @@
         """Create a new CHKInventory by applying inventory_delta to this one.
+        See the inventory developers documentation for the theory behind
+        inventory deltas.
         :param inventory_delta: The inventory delta to apply. See
             Inventory.apply_delta for details.
         :param new_revision_id: The revision id of the resulting CHKInventory.

=== modified file 'bzrlib/'
--- a/bzrlib/	2009-06-10 03:56:49 +0000
+++ b/bzrlib/	2009-07-02 07:22:27 +0000
@@ -523,6 +523,9 @@
         for commit which is not required to handle situations that do not arise
         outside of commit.
+        See the inventory developers documentation for the theory behind
+        inventory deltas.
         :param new_revid: The new revision id for the trees parent.
         :param delta: An inventory delta (see apply_inventory_delta) describing
             the changes from the current left most parent revision to new_revid.

=== modified file 'bzrlib/'
--- a/bzrlib/	2009-07-01 10:28:44 +0000
+++ b/bzrlib/	2009-07-02 23:10:53 +0000
@@ -1015,6 +1015,9 @@
                                parents, basis_inv=None, propagate_caches=False):
         """Add a new inventory expressed as a delta against another revision.
+        See the inventory developers documentation for the theory behind
+        inventory deltas.
         :param basis_revision_id: The inventory id the delta was created
             against. (This does not have to be a direct parent.)
         :param delta: The inventory delta (see Inventory.apply_delta for

=== modified file 'doc/developers/inventory.txt'
--- a/doc/developers/inventory.txt	2009-04-04 02:50:01 +0000
+++ b/doc/developers/inventory.txt	2009-07-02 07:22:27 +0000
@@ -498,3 +498,90 @@
 delta, the only other valid fields are OLDPATH and 'file-id'. PARENT_ID is ''
 when a delete has been recorded or when recording a new root entry.
+Delta consistency
+Inventory deltas and more broadly changes between trees are a significant part
+of bzr's core operations: they are key components in status, diff, commit,
+and merge (although merge uses tree transform, deltas contain the changes that
+are applied to the transform). Our ability to perform a given operation depends
+on us creating consistent deltas between trees. Inconsistent deltas lead to
+errors and bugs, or even just unexpected conflicts.
+An inventory delta is a transform to change an inventory A into another
+inventory B (in patch terms its a perfect patch). Sometimes, for instance in a
+regular commit, inventory B is known at the time we create the delta. Other
+times, B is not known because the user is requesting that some parts of the
+second inventory they have are masked out from consideration. When this happens
+we create a delta that when applied to A creates a B we haven't seen in total
+before. In this situation we need to ensure that B will be internally
+consistent. Deltas are unidirectional, a delta(A, B) creates B from A, but
+cannot be used to create A from B.
+Deltas are expressed as a list of (oldpath, newpath, fileid, entry) tuples. The
+fileid, entry elements are normative; the old and new paths are strong hints
+but not currently guaranteed to be accurate. (This is a shame and something we
+should tighten up). Deltas are required to list all removals explicitly -
+removing the parent of an entry doesn't remove the entry.
+Applying a delta to an inventory consists of:
+ - removing all fileids for which entry is None
+ - adding or replacing all other fileids
+ - detecting consistency errors
+An interesting aspect of delta inconsistencies is when we notice them:
+ - Silent errors which our application logic misses
+ - Visible errors we catch during application, so bad data isn't stored in
+   the system.
+The minimum safe level for our application logic would be to catch all errors
+during application. Making generation never generate inconsistent deltas is
+a seperate but necessary condition for robust code.
+An inconsistent delta is one which:
+ - after application to an inventory the inventory is an impossible state.
+ - has the same fileid, or oldpath(not-None), or newpath(not-None) multiple
+   times.
+ - has a fileid field different to the entry.fileid in the same item in the
+   delta.
+Forms of inventory inconsistency deltas can carry/cause:
+ - An entry newly introduced to a path without also removing or relocating any
+   existing entry at that path. (Duplicate paths)
+ - An entry whose parent id isn't present in the tree. (Missing parent).
+ - Having oldpath or newpath not be actual original path or resulting path.
+   (Wrong path)
+ - An entry whose parent is not a directory. (Under non-directory).
+ - An entry that is internally inconsistent.
+Known causes of inconsistency:
+ - A 'new' entry which the inventory already has - when this is a directory
+   even arbitrary file ids under the 'new' entry are more likely to collide on
+   paths.
+ - Removing a directory without recursively removing its children - causes
+   Missing parent.
+ - Recording a change to an entry without including all changed entries found
+   following its parents up to and includin the root - can cause duplicate
+   paths, missing parents, wrong path, under non-directory.
+Avoiding inconsistent deltas
+The simplest thing is to never create partial deltas, as it is trivial to
+be consistent when all data is examined every time. However users sometimes
+want to specify a subset of the changes in their tree when they do an operation
+which needs to create a delta - such as commit.
+We have a choice about handling user requests that can generate inconsistent
+deltas. We can alter or interpret the request in such a way that the delta will
+be consistent, but perhaps larger than the user had intended. Or we can
+identify problematic situations and abort, specifying to the user why we have
+aborted and likely things they can do to make their request generate a
+consistent delta.
+Currently we attempt to expand/interpret the request so that the user is not
+required to understand all the internal constraints of the system: if they
+request 'foo/bar' we automatically include foo. This works to an extent but on
+review we are creating inconsistent deltas by the way we do this. We need to
+avoid all the known causes of inconsistency in our delta creation logic.

More information about the bazaar-commits mailing list