Rev 2648: (Ian Clatworthy) Refactor commit to prepare for population by tree walking in file:///home/pqm/archives/thelove/bzr/%2Btrunk/

Canonical.com Patch Queue Manager pqm at pqm.ubuntu.com
Mon Jul 23 06:38:18 BST 2007


At file:///home/pqm/archives/thelove/bzr/%2Btrunk/

------------------------------------------------------------
revno: 2648
revision-id: pqm at pqm.ubuntu.com-20070723053815-oihay9qovs508r3k
parent: pqm at pqm.ubuntu.com-20070722185402-a7ib57fehfwd03jy
parent: ian.clatworthy at internode.on.net-20070723040130-z5yh5aj0g4zwqhnl
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Mon 2007-07-23 06:38:15 +0100
message:
  (Ian Clatworthy) Refactor commit to prepare for population by tree walking
modified:
  NEWS                           NEWS-20050323055033-4e00b5db738777ff
  bzrlib/commit.py               commit.py-20050511101309-79ec1a0168e0e825
  bzrlib/memorytree.py           memorytree.py-20060906023413-4wlkalbdpsxi2r4y-1
  bzrlib/tests/workingtree_implementations/test_commit.py test_commit.py-20060421013633-1610ec2331c8190f
    ------------------------------------------------------------
    revno: 2647.1.1
    merged: ian.clatworthy at internode.on.net-20070723040130-z5yh5aj0g4zwqhnl
    parent: pqm at pqm.ubuntu.com-20070722185402-a7ib57fehfwd03jy
    parent: ian.clatworthy at internode.on.net-20070723035711-a4y0zt5vvbaiuxty
    committer: Ian Clatworthy <ian.clatworthy at internode.on.net>
    branch nick: ianc-integration
    timestamp: Mon 2007-07-23 14:01:30 +1000
    message:
      (Ian Clatworthy) Refactor commit to prepare for population by tree walking
    ------------------------------------------------------------
    revno: 2564.2.7
    merged: ian.clatworthy at internode.on.net-20070723035711-a4y0zt5vvbaiuxty
    parent: ian.clatworthy at internode.on.net-20070718055129-u6tsvsqrd4pcuikw
    committer: Ian Clatworthy <ian.clatworthy at internode.on.net>
    branch nick: bzr.commit-prepare-populate-alternatives
    timestamp: Mon 2007-07-23 13:57:11 +1000
    message:
      Clean-up comments and make use of self.builder.record_root_entry more explicit
    ------------------------------------------------------------
    revno: 2564.2.6
    merged: ian.clatworthy at internode.on.net-20070718055129-u6tsvsqrd4pcuikw
    parent: ian.clatworthy at internode.on.net-20070713065436-5faa0xi6phxxvq1x
    committer: Ian Clatworthy <ian.clatworthy at internode.on.net>
    branch nick: bzr.commit-prepare-populate-alternatives
    timestamp: Wed 2007-07-18 15:51:29 +1000
    message:
      Incorporate feedback from abentley
    ------------------------------------------------------------
    revno: 2564.2.5
    merged: ian.clatworthy at internode.on.net-20070713065436-5faa0xi6phxxvq1x
    parent: ian.clatworthy at internode.on.net-20070713062245-m3ws7khi4jyhgbg3
    committer: Ian Clatworthy <ian.clatworthy at internode.on.net>
    branch nick: bzr.commit-prepare-populate-alternatives
    timestamp: Fri 2007-07-13 16:54:36 +1000
    message:
      Fix NEWS to reflect merge into 0.19 instead of 0.18
    ------------------------------------------------------------
    revno: 2564.2.4
    merged: ian.clatworthy at internode.on.net-20070713062245-m3ws7khi4jyhgbg3
    parent: ian.clatworthy at internode.on.net-20070713062159-kh1kde034wcrkygr
    parent: pqm at pqm.ubuntu.com-20070713060449-rydsxz28x12l2ksm
    committer: Ian Clatworthy <ian.clatworthy at internode.on.net>
    branch nick: bzr.commit-prepare-populate-alternatives
    timestamp: Fri 2007-07-13 16:22:45 +1000
    message:
      Merge bzr.dev
    ------------------------------------------------------------
    revno: 2564.2.3
    merged: ian.clatworthy at internode.on.net-20070713062159-kh1kde034wcrkygr
    parent: ian.clatworthy at internode.on.net-20070713053639-m9c55376prkqzq3w
    committer: Ian Clatworthy <ian.clatworthy at internode.on.net>
    branch nick: bzr.commit-prepare-populate-alternatives
    timestamp: Fri 2007-07-13 16:21:59 +1000
    message:
      more abentley feedback: use get_nested_tree and include file_id
    ------------------------------------------------------------
    revno: 2564.2.2
    merged: ian.clatworthy at internode.on.net-20070713053639-m9c55376prkqzq3w
    parent: ian.clatworthy at internode.on.net-20070629043501-zgm70dyx6ut3yerk
    committer: Ian Clatworthy <ian.clatworthy at internode.on.net>
    branch nick: bzr.commit-prepare-populate-alternatives
    timestamp: Fri 2007-07-13 15:36:39 +1000
    message:
      incorporate feedback from abentley
    ------------------------------------------------------------
    revno: 2564.2.1
    merged: ian.clatworthy at internode.on.net-20070629043501-zgm70dyx6ut3yerk
    parent: pqm at pqm.ubuntu.com-20070629000000-dmkdjthna7njsccg
    committer: Ian Clatworthy <ian.clatworthy at internode.on.net>
    branch nick: bzr.commit-prepare-populate-alternatives
    timestamp: Fri 2007-06-29 14:35:01 +1000
    message:
      refactor commit to support alternative population meothds
=== modified file 'NEWS'
--- a/NEWS	2007-07-22 18:54:02 +0000
+++ b/NEWS	2007-07-23 04:01:30 +0000
@@ -74,6 +74,9 @@
 
     * Annotate merge now works when there are local changes. (Aaron Bentley)
 
+    * Commit now only shows the progress in terms of directories instead of
+      entries. (Ian Clatworthy)
+
   LIBRARY API BREAKS:
 
     * Deprecated dictionary ``bzrlib.option.SHORT_OPTIONS`` removed.

=== modified file 'bzrlib/commit.py'
--- a/bzrlib/commit.py	2007-07-13 02:23:34 +0000
+++ b/bzrlib/commit.py	2007-07-23 03:57:11 +0000
@@ -256,11 +256,6 @@
             # Check that the working tree is up to date
             old_revno,new_revno = self._check_out_of_date_tree()
 
-            if strict:
-                # raise an exception as soon as we find a single unknown.
-                for unknown in self.work_tree.unknowns():
-                    raise StrictCommitFailed()
-                   
             if self.config is None:
                 self.config = self.branch.get_config()
 
@@ -297,7 +292,8 @@
                 raise errors.CannotCommitSelectedFileMerge(self.specific_files)
             
             # Collect the changes
-            self._emit_progress_set_stage("Collecting changes", show_entries=True)
+            self._set_progress_stage("Collecting changes",
+                    entries_title="Directory")
             self.builder = self.branch.get_commit_builder(self.parents,
                 self.config, timestamp, timezone, committer, revprops, rev_id)
             self._update_builder_with_changes()
@@ -307,7 +303,7 @@
             # ADHB 2006-08-08: If this is done, populate_new_inv should not add
             # weave lines, because nothing should be recorded until it is known
             # that commit will succeed.
-            self._emit_progress_set_stage("Saving data locally")
+            self._set_progress_stage("Saving data locally")
             self.builder.finish_inventory()
 
             # Prompt the user for a commit message if none provided
@@ -322,7 +318,7 @@
             # Upload revision data to the master.
             # this will propagate merged revisions too if needed.
             if self.bound_branch:
-                self._emit_progress_set_stage("Uploading data to master branch")
+                self._set_progress_stage("Uploading data to master branch")
                 self.master_branch.repository.fetch(self.branch.repository,
                                                     revision_id=self.rev_id)
                 # now the master has the revision data
@@ -335,7 +331,7 @@
             self.branch.set_last_revision_info(new_revno, self.rev_id)
 
             # Make the working tree up to date with the branch
-            self._emit_progress_set_stage("Updating the working tree")
+            self._set_progress_stage("Updating the working tree")
             rev_tree = self.builder.revision_tree()
             self.work_tree.set_parent_trees([(self.rev_id, rev_tree)])
             self.reporter.completed(new_revno, self.rev_id)
@@ -468,7 +464,7 @@
     def _process_hooks(self, old_revno, new_revno):
         """Process any registered commit hooks."""
         # Process the post commit hooks, if any
-        self._emit_progress_set_stage("Running post commit hooks")
+        self._set_progress_stage("Running post commit hooks")
         # old style commit hooks - should be deprecated ? (obsoleted in
         # 0.15)
         if self.config.post_commit() is not None:
@@ -594,24 +590,59 @@
 
         specific_files = self.specific_files
         mutter("Selecting files for commit with filter %s", specific_files)
-        work_inv = self.work_tree.inventory
-        assert work_inv.root is not None
-        self.pb_entries_total = len(work_inv)
 
         # Check and warn about old CommitBuilders
-        entries = work_inv.iter_entries()
         if not self.builder.record_root_entry:
             symbol_versioning.warn('CommitBuilders should support recording'
                 ' the root entry as of bzr 0.10.', DeprecationWarning, 
                 stacklevel=1)
             self.builder.new_inventory.add(self.basis_inv.root.copy())
-            entries.next()
-
+
+        # Build the new inventory
+        self._populate_from_inventory(specific_files)
+
+        # If specific files are selected, then all un-selected files must be
+        # recorded in their previous state. For more details, see
+        # https://lists.ubuntu.com/archives/bazaar/2007q3/028476.html.
+        if specific_files:
+            for path, new_ie in self.basis_inv.iter_entries():
+                if new_ie.file_id in self.builder.new_inventory:
+                    continue
+                if is_inside_any(specific_files, path):
+                    continue
+                ie = new_ie.copy()
+                ie.revision = None
+                self.builder.record_entry_contents(ie, self.parent_invs, path,
+                                                   self.basis_tree)
+
+        # Report what was deleted. We could skip this when no deletes are
+        # detected to gain a performance win, but it arguably serves as a
+        # 'safety check' by informing the user whenever anything disappears.
+        for path, ie in self.basis_inv.iter_entries():
+            if ie.file_id not in self.builder.new_inventory:
+                self.reporter.deleted(path)
+
+    def _populate_from_inventory(self, specific_files):
+        """Populate the CommitBuilder by walking the working tree inventory."""
+        if self.strict:
+            # raise an exception as soon as we find a single unknown.
+            for unknown in self.work_tree.unknowns():
+                raise StrictCommitFailed()
+               
         deleted_ids = []
         deleted_paths = set()
-        for path, new_ie in entries:
-            self._emit_progress_next_entry()
-            file_id = new_ie.file_id
+        work_inv = self.work_tree.inventory
+        assert work_inv.root is not None
+        entries = work_inv.iter_entries()
+        if not self.builder.record_root_entry:
+            entries.next()
+        for path, existing_ie in entries:
+            file_id = existing_ie.file_id
+            name = existing_ie.name
+            parent_id = existing_ie.parent_id
+            kind = existing_ie.kind
+            if kind == 'directory':
+                self._next_progress_entry()
 
             # Skip files that have been deleted from the working tree.
             # The deleted files/directories are also recorded so they
@@ -628,111 +659,116 @@
                     continue
             try:
                 kind = self.work_tree.kind(file_id)
+                # TODO: specific_files filtering before nested tree processing
                 if kind == 'tree-reference' and self.recursive == 'down':
-                    # nested tree: commit in it
-                    sub_tree = WorkingTree.open(self.work_tree.abspath(path))
-                    # FIXME: be more comprehensive here:
-                    # this works when both trees are in --trees repository,
-                    # but when both are bound to a different repository,
-                    # it fails; a better way of approaching this is to 
-                    # finally implement the explicit-caches approach design
-                    # a while back - RBC 20070306.
-                    if (sub_tree.branch.repository.bzrdir.root_transport.base
-                        ==
-                        self.work_tree.branch.repository.bzrdir.root_transport.base):
-                        sub_tree.branch.repository = \
-                            self.work_tree.branch.repository
-                    try:
-                        sub_tree.commit(message=None, revprops=self.revprops,
-                            recursive=self.recursive,
-                            message_callback=self.message_callback,
-                            timestamp=self.timestamp, timezone=self.timezone,
-                            committer=self.committer,
-                            allow_pointless=self.allow_pointless,
-                            strict=self.strict, verbose=self.verbose,
-                            local=self.local, reporter=self.reporter)
-                    except errors.PointlessCommit:
-                        pass
-                if kind != new_ie.kind:
-                    new_ie = inventory.make_entry(kind, new_ie.name,
-                                                  new_ie.parent_id, file_id)
+                    self._commit_nested_tree(file_id, path)
             except errors.NoSuchFile:
                 pass
-            # mutter('check %s {%s}', path, file_id)
-            if (not specific_files or 
-                is_inside_or_parent_of_any(specific_files, path)):
-                    # mutter('%s selected for commit', path)
-                    ie = new_ie.copy()
+
+            # Record an entry for this item
+            # Note: I don't particularly want to have the existing_ie
+            # parameter but the test suite currently (28-Jun-07) breaks
+            # without it thanks to a unicode normalisation issue. :-(
+            definitely_changed = kind != existing_ie.kind 
+            self._record_entry(path, file_id, specific_files, kind, name,
+                parent_id, definitely_changed, existing_ie)
+
+        # Unversion IDs that were found to be deleted
+        self.work_tree.unversion(deleted_ids)
+
+    def _commit_nested_tree(self, file_id, path):
+        "Commit a nested tree."
+        sub_tree = self.work_tree.get_nested_tree(file_id, path)
+        # FIXME: be more comprehensive here:
+        # this works when both trees are in --trees repository,
+        # but when both are bound to a different repository,
+        # it fails; a better way of approaching this is to 
+        # finally implement the explicit-caches approach design
+        # a while back - RBC 20070306.
+        if (sub_tree.branch.repository.bzrdir.root_transport.base
+            ==
+            self.work_tree.branch.repository.bzrdir.root_transport.base):
+            sub_tree.branch.repository = \
+                self.work_tree.branch.repository
+        try:
+            sub_tree.commit(message=None, revprops=self.revprops,
+                recursive=self.recursive,
+                message_callback=self.message_callback,
+                timestamp=self.timestamp, timezone=self.timezone,
+                committer=self.committer,
+                allow_pointless=self.allow_pointless,
+                strict=self.strict, verbose=self.verbose,
+                local=self.local, reporter=self.reporter)
+        except errors.PointlessCommit:
+            pass
+
+    def _record_entry(self, path, file_id, specific_files, kind, name,
+                      parent_id, definitely_changed, existing_ie=None):
+        "Record the new inventory entry for a path if any."
+        # mutter('check %s {%s}', path, file_id)
+        if (not specific_files or 
+            is_inside_or_parent_of_any(specific_files, path)):
+                # mutter('%s selected for commit', path)
+                if definitely_changed or existing_ie is None:
+                    ie = inventory.make_entry(kind, name, parent_id, file_id)
+                else:
+                    ie = existing_ie.copy()
                     ie.revision = None
+        else:
+            # mutter('%s not selected for commit', path)
+            if self.basis_inv.has_id(file_id):
+                ie = self.basis_inv[file_id].copy()
             else:
-                # mutter('%s not selected for commit', path)
-                if self.basis_inv.has_id(file_id):
-                    ie = self.basis_inv[file_id].copy()
-                else:
-                    # this entry is new and not being committed
-                    continue
+                # this entry is new and not being committed
+                ie = None
+        if ie is not None:
             self.builder.record_entry_contents(ie, self.parent_invs, 
                 path, self.work_tree)
-            # describe the nature of the change that has occurred relative to
-            # the basis inventory.
-            if (self.basis_inv.has_id(ie.file_id)):
-                basis_ie = self.basis_inv[ie.file_id]
-            else:
-                basis_ie = None
-            change = ie.describe_change(basis_ie, ie)
-            if change in (InventoryEntry.RENAMED, 
-                InventoryEntry.MODIFIED_AND_RENAMED):
-                old_path = self.basis_inv.id2path(ie.file_id)
-                self.reporter.renamed(change, old_path, path)
-            else:
-                self.reporter.snapshot_change(change, path)
-
-        # Unversion IDs that were found to be deleted
-        self.work_tree.unversion(deleted_ids)
-
-        # If specific files/directories were nominated, it is possible
-        # that some data from outside those needs to be preserved from
-        # the basis tree. For example, if a file x is moved from out of
-        # directory foo into directory bar and the user requests
-        # ``commit foo``, then information about bar/x must also be
-        # recorded.
-        if specific_files:
-            for path, new_ie in self.basis_inv.iter_entries():
-                if new_ie.file_id in work_inv:
-                    continue
-                if is_inside_any(specific_files, path):
-                    continue
-                ie = new_ie.copy()
-                ie.revision = None
-                self.builder.record_entry_contents(ie, self.parent_invs, path,
-                                                   self.basis_tree)
-
-        # Report what was deleted. We could skip this when no deletes are
-        # detected to gain a performance win, but it arguably serves as a
-        # 'safety check' by informing the user whenever anything disappears.
-        for path, ie in self.basis_inv.iter_entries():
-            if ie.file_id not in self.builder.new_inventory:
-                self.reporter.deleted(path)
-
-    def _emit_progress_set_stage(self, name, show_entries=False):
+            self._report_change(ie, path)
+        return ie
+
+    def _report_change(self, ie, path):
+        """Report a change to the user.
+
+        The change that has occurred is described relative to the basis
+        inventory.
+        """
+        if (self.basis_inv.has_id(ie.file_id)):
+            basis_ie = self.basis_inv[ie.file_id]
+        else:
+            basis_ie = None
+        change = ie.describe_change(basis_ie, ie)
+        if change in (InventoryEntry.RENAMED, 
+            InventoryEntry.MODIFIED_AND_RENAMED):
+            old_path = self.basis_inv.id2path(ie.file_id)
+            self.reporter.renamed(change, old_path, path)
+        else:
+            self.reporter.snapshot_change(change, path)
+
+    def _set_progress_stage(self, name, entries_title=None):
         """Set the progress stage and emit an update to the progress bar."""
         self.pb_stage_name = name
         self.pb_stage_count += 1
-        self.pb_entries_show = show_entries
-        if show_entries:
+        self.pb_entries_title = entries_title
+        if entries_title is not None:
             self.pb_entries_count = 0
             self.pb_entries_total = '?'
         self._emit_progress()
 
-    def _emit_progress_next_entry(self):
-        """Emit an update to the progress bar and increment the file count."""
+    def _next_progress_entry(self):
+        """Emit an update to the progress bar and increment the entry count."""
         self.pb_entries_count += 1
         self._emit_progress()
 
     def _emit_progress(self):
-        if self.pb_entries_show:
-            text = "%s [Entry %d/%s] - Stage" % (self.pb_stage_name,
-                self.pb_entries_count,str(self.pb_entries_total))
+        if self.pb_entries_title:
+            if self.pb_entries_total == '?':
+                text = "%s [%s %d] - Stage" % (self.pb_stage_name,
+                    self.pb_entries_title, self.pb_entries_count)
+            else:
+                text = "%s [%s %d/%s] - Stage" % (self.pb_stage_name,
+                    self.pb_entries_title, self.pb_entries_count,
+                    str(self.pb_entries_total))
         else:
             text = "%s - Stage" % (self.pb_stage_name)
         self.pb.update(text, self.pb_stage_count, self.pb_stage_total)

=== modified file 'bzrlib/memorytree.py'
--- a/bzrlib/memorytree.py	2007-07-11 19:44:51 +0000
+++ b/bzrlib/memorytree.py	2007-07-13 06:22:45 +0000
@@ -83,13 +83,25 @@
         """See Tree.get_file."""
         return self._file_transport.get(self.id2path(file_id))
 
-    def get_file_sha1(self, file_id, path=None):
+    def get_file_sha1(self, file_id, path=None, stat_value=None):
         """See Tree.get_file_sha1()."""
         if path is None:
             path = self.id2path(file_id)
         stream = self._file_transport.get(path)
         return sha_file(stream)
 
+    def _comparison_data(self, entry, path):
+        """See Tree._comparison_data."""
+        if entry is None:
+            return None, False, None
+        return entry.kind, entry.executable, None
+
+    def _file_size(self, entry, stat_value):
+        """See Tree._file_size."""
+        if entry is None:
+            return 0
+        return entry.text_size
+
     @needs_read_lock
     def get_parent_ids(self):
         """See Tree.get_parent_ids.

=== modified file 'bzrlib/tests/workingtree_implementations/test_commit.py'
--- a/bzrlib/tests/workingtree_implementations/test_commit.py	2007-07-12 07:22:52 +0000
+++ b/bzrlib/tests/workingtree_implementations/test_commit.py	2007-07-13 06:22:45 +0000
@@ -354,15 +354,10 @@
         # into the factory for this test - just make the test ui factory
         # pun as a reporter. Then we can check the ordering is right.
         tree.commit('second post', specific_files=['b'])
-        # 4 steps, the first of which is reported 5 times, once per file
-        # 2 files don't trigger an update, as 'a' and 'c' are not 
-        # committed.
+        # 4 steps, the first of which is reported 2 times, once per dir
         self.assertEqual(
-            [('update', 1, 4, 'Collecting changes [Entry 0/?] - Stage'),
-             ('update', 1, 4, 'Collecting changes [Entry 1/4] - Stage'),
-             ('update', 1, 4, 'Collecting changes [Entry 2/4] - Stage'),
-             ('update', 1, 4, 'Collecting changes [Entry 3/4] - Stage'),
-             ('update', 1, 4, 'Collecting changes [Entry 4/4] - Stage'),
+            [('update', 1, 4, 'Collecting changes [Directory 0] - Stage'),
+             ('update', 1, 4, 'Collecting changes [Directory 1] - Stage'),
              ('update', 2, 4, 'Saving data locally - Stage'),
              ('update', 3, 4, 'Updating the working tree - Stage'),
              ('update', 4, 4, 'Running post commit hooks - Stage')],
@@ -383,8 +378,8 @@
         branch.Branch.hooks.name_hook(a_hook, 'hook name')
         tree.commit('first post')
         self.assertEqual(
-            [('update', 1, 4, 'Collecting changes [Entry 0/?] - Stage'),
-             ('update', 1, 4, 'Collecting changes [Entry 1/1] - Stage'),
+            [('update', 1, 4, 'Collecting changes [Directory 0] - Stage'),
+             ('update', 1, 4, 'Collecting changes [Directory 1] - Stage'),
              ('update', 2, 4, 'Saving data locally - Stage'),
              ('update', 3, 4, 'Updating the working tree - Stage'),
              ('update', 4, 4, 'Running post commit hooks - Stage'),




More information about the bazaar-commits mailing list