[MERGE] iter-changes based commit

Robert Collins robertc at robertcollins.net
Thu Sep 6 02:36:46 BST 2007


On Tue, 2007-09-04 at 01:56 +1000, Ian Clatworthy wrote:
> The attached patch speeds up a commit of 6 new files by 15% on a Mozilla
> tree: 30.4 secs to 25.9 secs. It should become redundant in the next
> release or so when Robert's better iterator lands. But that code isn't
> ready and this code is.
> 
> Is it worth the benefit/risk of including this in 0.91?

Here's what I'd like to see:

+        # Note: A smart commit iterator is in the pipeline. Until it
+        # arrives, we special case initial commit and non-merge commits
+        # for performance.
+        if len(self.parents) <= 1:
+            self._populate_from_tree(specific_files)
+        else:
+            self._populate_from_inventory(specific_files)

One code path. There is no need to special-case here when using _iter_changes
as a broader 'iterate the tree efficiently, including data the inventory
does not know about' tool. Do any tests fail when there is only one code
path?

This case:
+            # Skip unknowns unless strict mode
+            if versioned_flags == (False,False):
+                if self.strict:
+                    raise StrictCommitFailed()
+                else:
+                    continue

Add a comment:
                     # Could version the file just-in-time here.
Also, I think it's going to lead to a bug as it stands in N-parent trees:
a file that is versioned in the 2nd parent tree but unversioned in this
tree does not get reported at all, though it probably should be. This is one
of the things iter_changes isn't great at as an interface, so we should
tweak it; really we want a vector of all the parent versioned flags.
(This is what my commit-ready iterator does, and why it may be easier to
work with.)
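
To illustrate the N-parent concern, here is a minimal sketch. It is not
bzrlib code: trees are modelled as plain sets of versioned file ids, and
versioned_vector() is a hypothetical helper invented for this example. It
only shows why a (basis, current) pair of flags can hide a file that the
second parent versions.

```python
# Illustrative only: each tree is modelled as a set of versioned file ids.
# versioned_vector() is a hypothetical helper, not a bzrlib function.

def versioned_vector(file_id, parent_trees, current_tree):
    """Return one versioned flag per parent tree, then one for the current tree."""
    flags = tuple(file_id in tree for tree in parent_trees)
    return flags + (file_id in current_tree,)


basis = {"a-id"}
parent_2 = {"a-id", "b-id"}   # b-id is versioned only in the second parent
current = {"a-id"}

# A two-tree comparison (basis vs current) sees b-id as unversioned on both
# sides, so it would be skipped as an "unknown".
two_way = ("b-id" in basis, "b-id" in current)

# The full vector keeps the information that the second parent versions it.
full = versioned_vector("b-id", [basis, parent_2], current)
```

With the full vector, the caller can tell "unknown everywhere" apart from
"versioned in some parent but not here" and decide what to report.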


Test-wise, I have a few pet corner cases that I don't know if we test today:

basis:
/foo foo-id
/foo/gam gam-id
/foo/quux quux-id

current:
/bar foo-id (change path, remove child)
/bar/gam gam-id (unchanged)
/gam quux-id (change the name)

'commit foo' or 'commit bar' should both record new versions of foo-id
and quux-id at /bar and /gam and the existing version of gam-id
at /bar/gam

'commit gam' should record a new version of quux-id at /gam and the
existing versions of foo-id and gam-id at /bar and /bar/gam

'commit foo/gam' or 'commit bar/gam' should error when allow_unchanged
is False.
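
The expectations above can be modelled in a few lines. This is a sketch
under stated assumptions, not the commit code: a tree is a dict mapping
file id to (parent id, name), a path selects the file id found at that path
in either tree plus everything below that id in either tree, and
PointlessCommit here is a local stand-in exception. All helper names are
made up for the example.

```python
# Illustrative model only: a tree maps file_id -> (parent_id, name).
# None of these helpers are bzrlib API; PointlessCommit is a local stand-in.

class PointlessCommit(Exception):
    pass


def path_of(tree, file_id):
    """Build the path of file_id by walking up its parent ids."""
    parent_id, name = tree[file_id]
    prefix = "" if parent_id is None else path_of(tree, parent_id)
    return prefix + "/" + name


def ancestors(tree, file_id):
    """Yield file_id and then each of its parent ids within tree."""
    while file_id is not None and file_id in tree:
        yield file_id
        file_id = tree[file_id][0]


def selected_ids(trees, path):
    """Ids at `path` in any tree, plus ids below those ids in any tree."""
    path = "/" + path.strip("/")
    roots = {i for t in trees for i in t if path_of(t, i) == path}
    return {i for t in trees for i in t if roots.intersection(ancestors(t, i))}


def commit(basis, current, path, allow_unchanged=False):
    """Return the set of file ids that would get a new version."""
    chosen = selected_ids([basis, current], path)
    changed = {i for i in chosen if basis.get(i) != current.get(i)}
    if not changed and not allow_unchanged:
        raise PointlessCommit(path)
    return changed


basis = {
    "foo-id": (None, "foo"),
    "gam-id": ("foo-id", "gam"),
    "quux-id": ("foo-id", "quux"),
}
current = {
    "foo-id": (None, "bar"),       # change path, remove child
    "gam-id": ("foo-id", "gam"),   # unchanged
    "quux-id": (None, "gam"),      # moved out of foo-id and renamed
}
```

Selecting by file id rather than by path prefix is what makes 'commit bar'
pick up quux-id: it is no longer under /bar in the current tree, but it is a
child of foo-id in the basis.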

with an extra parent:
original tree:
/bar foo-id at rev1

basis:
/foo foo-id at rev-2

parent-2:
/foo foo-id at rev-3

current:
/foo foo-id

Committing foo records foo-id at rev-4 even though no changes were made
against any parent.

(See my commit builder tests that went in a couple of days back for
low-level tests that this works as designed.) They include the full setup:
make a tree, add the path and commit, then branch and make a change,
commit, make the same change, commit, then merge both and commit.
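
One way to picture why the extra-parent case records rev-4: treat the
last-modified revisions of foo-id in each parent as candidates and keep only
those that are not ancestors of another candidate. The sketch below assumes
a per-file revision graph stored as a plain dict; heads() and
needs_new_version() are hypothetical names for this example, not the bzrlib
graph API.

```python
# Illustrative only: a per-file graph as {revision: [parent revisions]}.
# heads() and needs_new_version() are stand-ins, not bzrlib functions.

def ancestry(graph, rev):
    """All revisions reachable from rev, including rev itself."""
    seen, todo = set(), [rev]
    while todo:
        r = todo.pop()
        if r not in seen:
            seen.add(r)
            todo.extend(graph[r])
    return seen


def heads(graph, revs):
    """Revisions in revs that are not ancestors of another revision in revs."""
    revs = set(revs)
    return {r for r in revs
            if not any(r in ancestry(graph, other) for other in revs - {r})}


def needs_new_version(graph, parent_revs, changed_vs_basis):
    # More than one head means the parents disagree about which revision
    # last modified the file, so the merge itself is recorded for this id.
    return changed_vs_basis or len(heads(graph, parent_revs)) > 1


# foo-id was introduced in rev1; both rev-2 and rev-3 renamed it to /foo.
file_graph = {"rev1": [], "rev-2": ["rev1"], "rev-3": ["rev1"]}
record_rev_4 = needs_new_version(file_graph, ["rev-2", "rev-3"], False)
```

Here rev-2 and rev-3 are both heads, so a new version is recorded even
though the current tree matches each parent.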



I think that if you were to fix the two external bugs to my iterator it
would fit well with this branch and help make it mergeable.

-Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.