Rev 2879: (robertc) Move serialisation of inventory entries into CommitBuilder. (Robert Collins) in file:///home/pqm/archives/thelove/bzr/%2Btrunk/

Canonical.com Patch Queue Manager pqm at pqm.ubuntu.com
Wed Oct 3 06:04:44 BST 2007


At file:///home/pqm/archives/thelove/bzr/%2Btrunk/

------------------------------------------------------------
revno: 2879
revision-id: pqm at pqm.ubuntu.com-20071003050442-e0x9ofdfo0hwxnal
parent: pqm at pqm.ubuntu.com-20071003042319-er7zwdzllxy1ywg7
parent: robertc at robertcollins.net-20071003042301-a8uqkwlqr5owcqbo
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Wed 2007-10-03 06:04:42 +0100
message:
  (robertc) Move serialisation of inventory entries into CommitBuilder. (Robert Collins)
modified:
  NEWS                           NEWS-20050323055033-4e00b5db738777ff
  bzrlib/commit.py               commit.py-20050511101309-79ec1a0168e0e825
  bzrlib/inventory.py            inventory.py-20050309040759-6648b84ca2005b37
  bzrlib/memorytree.py           memorytree.py-20060906023413-4wlkalbdpsxi2r4y-1
  bzrlib/repofmt/weaverepo.py    presplitout.py-20070125045333-wfav3tsh73oxu3zk-1
  bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
  bzrlib/tests/interrepository_implementations/test_interrepository.py test_interrepository.py-20060220061411-1ec13fa99e5e3eee
  bzrlib/tests/repository_implementations/test_commit_builder.py test_commit_builder.py-20060606110838-76e3ra5slucqus81-1
  bzrlib/tests/workingtree_implementations/test_inv.py test_inv.py-20070311221604-ighlq8tbn5xq0kuo-1
    ------------------------------------------------------------
    revno: 2776.1.7.1.19
    merged: robertc at robertcollins.net-20071003042301-a8uqkwlqr5owcqbo
    parent: robertc at robertcollins.net-20071003000505-ls3k5qm7nw1us6j4
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Wed 2007-10-03 14:23:01 +1000
    message:
      Final review tweaks.
    ------------------------------------------------------------
    revno: 2776.1.7.1.18
    merged: robertc at robertcollins.net-20071003000505-ls3k5qm7nw1us6j4
    parent: robertc at robertcollins.net-20071002010803-iu712x2x1vl0algc
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Wed 2007-10-03 10:05:05 +1000
    message:
      Review feedback.
    ------------------------------------------------------------
    revno: 2776.1.7.1.17
    merged: robertc at robertcollins.net-20071002010803-iu712x2x1vl0algc
    parent: robertc at robertcollins.net-20071002010655-7ba9irkoevv3rnqa
    parent: pqm at pqm.ubuntu.com-20070928041435-uls0r9txks111272
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Tue 2007-10-02 11:08:03 +1000
    message:
      Merge bzr.dev.
    ------------------------------------------------------------
    revno: 2776.1.7.1.16
    merged: robertc at robertcollins.net-20071002010655-7ba9irkoevv3rnqa
    parent: robertc at robertcollins.net-20070927213849-fo6b41yk0pfzym1x
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Tue 2007-10-02 11:06:55 +1000
    message:
      Remove irrelevant TODO.
    ------------------------------------------------------------
    revno: 2776.1.7.1.15
    merged: robertc at robertcollins.net-20070927213849-fo6b41yk0pfzym1x
    parent: robertc at robertcollins.net-20070927211138-ebsu1bo1qz9f1w8n
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Fri 2007-09-28 07:38:49 +1000
    message:
      Put checking for recursive commits back to the commit driver, not commit builder.
    ------------------------------------------------------------
    revno: 2776.1.7.1.14
    merged: robertc at robertcollins.net-20070927211138-ebsu1bo1qz9f1w8n
    parent: robertc at robertcollins.net-20070921061031-p945q13ra1jjcli2
    parent: pqm at pqm.ubuntu.com-20070927065437-oc8enxoek80hyxod
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Fri 2007-09-28 07:11:38 +1000
    message:
      Merge bzr.dev.
    ------------------------------------------------------------
    revno: 2776.1.7.1.13
    merged: robertc at robertcollins.net-20070921061031-p945q13ra1jjcli2
    parent: robertc at robertcollins.net-20070917061036-fzr1h7zs9b3rr573
    parent: pqm at pqm.ubuntu.com-20070921051316-85muv96iv0duh31j
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Fri 2007-09-21 16:10:31 +1000
    message:
      Merge bzr.dev.
    ------------------------------------------------------------
    revno: 2776.1.7.1.12
    merged: robertc at robertcollins.net-20070917061036-fzr1h7zs9b3rr573
    parent: robertc at robertcollins.net-20070917014525-s9rxjft45ka581fg
    parent: robertc at robertcollins.net-20070917053356-05fyvmd1b3xalx6h
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Mon 2007-09-17 16:10:36 +1000
    message:
      Merge rename_one fix which fixes a test suite api violation allowing more commit refactoring.
    ------------------------------------------------------------
    revno: 2776.1.7.1.11
    merged: robertc at robertcollins.net-20070917014525-s9rxjft45ka581fg
    parent: robertc at robertcollins.net-20070914034315-pyf153e6rrw792z4
    parent: pqm at pqm.ubuntu.com-20070917005035-cshdkpzbj63id1uw
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Mon 2007-09-17 11:45:25 +1000
    message:
      Merge bzr.dev.
    ------------------------------------------------------------
    revno: 2776.1.7.1.10
    merged: robertc at robertcollins.net-20070914034315-pyf153e6rrw792z4
    parent: robertc at robertcollins.net-20070914000845-qac6s3w4xj41ivmh
    parent: pqm at pqm.ubuntu.com-20070914031609-ccdhl0ebzrohpa1t
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Fri 2007-09-14 13:43:15 +1000
    message:
      Merge bzr.dev.
    ------------------------------------------------------------
    revno: 2776.1.7.1.9
    merged: robertc at robertcollins.net-20070914000845-qac6s3w4xj41ivmh
    parent: robertc at robertcollins.net-20070913032229-vdojubs6akxlk4ao
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Fri 2007-09-14 10:08:45 +1000
    message:
      Unbreak weaves.
    ------------------------------------------------------------
    revno: 2776.1.7.1.8
    merged: robertc at robertcollins.net-20070913032229-vdojubs6akxlk4ao
    parent: robertc at robertcollins.net-20070912050636-c6fu3lybn1tbftk5
    parent: pqm at pqm.ubuntu.com-20070912222627-zvqit350mf6gvrbh
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Thu 2007-09-13 13:22:29 +1000
    message:
      Merge bzr.dev
    ------------------------------------------------------------
    revno: 2776.1.7.1.7
    merged: robertc at robertcollins.net-20070912050636-c6fu3lybn1tbftk5
    parent: robertc at robertcollins.net-20070911090940-2lbw0h6ev8z2zp6w
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Wed 2007-09-12 15:06:36 +1000
    message:
      Remove another stat by using path_content_summary to avoid a has_filename call.
    ------------------------------------------------------------
    revno: 2776.1.7.1.6
    merged: robertc at robertcollins.net-20070911090940-2lbw0h6ev8z2zp6w
    parent: robertc at robertcollins.net-20070909224707-s1vrl45nvle0k51m
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Tue 2007-09-11 19:09:40 +1000
    message:
      Fixup various commit test failures falling out from the other commit changes.
    ------------------------------------------------------------
    revno: 2776.1.7.1.5
    merged: robertc at robertcollins.net-20070909224707-s1vrl45nvle0k51m
    parent: robertc at robertcollins.net-20070906010906-2so4bg4lr3y8wlqd
    parent: pqm at pqm.ubuntu.com-20070907145828-hjh5941jv7y8d9z8
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Mon 2007-09-10 08:47:07 +1000
    message:
      Merge bzr.dev.
    ------------------------------------------------------------
    revno: 2776.1.7.1.4
    merged: robertc at robertcollins.net-20070906010906-2so4bg4lr3y8wlqd
    parent: robertc at robertcollins.net-20070905230807-xxl20o4evg2wrswh
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Thu 2007-09-06 11:09:06 +1000
    message:
      Move content summary generation outside of record_entry_contents.
    ------------------------------------------------------------
    revno: 2776.1.7.1.3
    merged: robertc at robertcollins.net-20070905230807-xxl20o4evg2wrswh
    parent: robertc at robertcollins.net-20070905055134-pwbueao0qq6krf9u
    parent: pqm at pqm.ubuntu.com-20070905084824-xdwd8f4fioovdi9v
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Thu 2007-09-06 09:08:07 +1000
    message:
      Merge bzr.dev.
    ------------------------------------------------------------
    revno: 2776.1.7.1.2
    merged: robertc at robertcollins.net-20070905055134-pwbueao0qq6krf9u
    parent: robertc at robertcollins.net-20070905035316-1gp1s0fx6gmvbnqu
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Wed 2007-09-05 15:51:34 +1000
    message:
      nuke _read_tree_state and snapshot from inventory, moving responsibility into the commit builder.
    ------------------------------------------------------------
    revno: 2776.1.7.1.1
    merged: robertc at robertcollins.net-20070905035316-1gp1s0fx6gmvbnqu
    parent: robertc at robertcollins.net-20070904100858-971b6sssddwfmwrw
    parent: robertc at robertcollins.net-20070905035159-6j1mh2gnrhpez5jp
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit
    timestamp: Wed 2007-09-05 13:53:16 +1000
    message:
      Knit support for commit refactoring.
=== modified file 'NEWS'
--- a/NEWS	2007-10-03 03:14:53 +0000
+++ b/NEWS	2007-10-03 05:04:42 +0000
@@ -164,6 +164,10 @@
      duplication from user input, when a user mentions both a path and an item
      contained within that path. (Robert Collins)
 
+   * New method on ``bzrlib.tree.Tree`` ``path_content_summary`` provides a
+     tuple containing the key information about a path for commit processing
+     to complete. (Robert Collins)
+
    * New method on xml serialisers, write_inventory_to_lines, which matches the
      API used by knits for adding content. (Robert Collins)
 
@@ -468,10 +472,6 @@
      incremental addition of data to a file without requiring that all the
      data be buffered in memory. (Robert Collins)
 
-   * New method on ``bzrlib.tree.Tree`` ``path_content_summary`` provides a
-     tuple containing the key information about a path for commit processing
-     to complete. (Robert Collins)
-
    * New methods on ``bzrlib.knit.KnitVersionedFile``:
      ``get_data_stream(versions)``, ``insert_data_stream(stream)`` and
      ``get_format_signature()``.  These provide some infrastructure for

=== modified file 'bzrlib/commit.py'
--- a/bzrlib/commit.py	2007-09-21 04:22:53 +0000
+++ b/bzrlib/commit.py	2007-10-02 01:06:55 +0000
@@ -247,6 +247,7 @@
         self.local = local
         self.master_branch = None
         self.master_locked = False
+        self.recursive = recursive
         self.rev_id = None
         if specific_files is not None:
             self.specific_files = sorted(
@@ -255,7 +256,6 @@
             self.specific_files = None
         self.specific_file_ids = None
         self.allow_pointless = allow_pointless
-        self.recursive = recursive
         self.revprops = revprops
         self.message_callback = message_callback
         self.timestamp = timestamp
@@ -289,12 +289,11 @@
 
             # If provided, ensure the specified files are versioned
             if self.specific_files is not None:
-                # Note: This routine
-                # is being called because it raises PathNotVerisonedError
-                # as a side effect of finding the IDs. We later use the ids we
-                # found as input to the working tree inventory iterator, so we
-                # only consider those ids rather than examining the whole tree
-                # again.
+                # Note: This routine is being called because it raises
+                # PathNotVersionedError as a side effect of finding the IDs. We
+                # later use the ids we found as input to the working tree
+                # inventory iterator, so we only consider those ids rather than
+                # examining the whole tree again.
                 # XXX: Dont we have filter_unversioned to do this more
                 # cheaply?
                 self.specific_file_ids = tree.find_ids_across_trees(
@@ -650,11 +649,16 @@
         if specific_files:
             for path, old_ie in self.basis_inv.iter_entries():
                 if old_ie.file_id in self.builder.new_inventory:
+                    # already added - skip.
                     continue
                 if is_inside_any(specific_files, path):
+                    # was inside the selected path, if not present it has been
+                    # deleted so skip.
                     continue
                 if old_ie.kind == 'directory':
                     self._next_progress_entry()
+                # not in final inv yet, was not in the selected files, so is an
+                # entry to be preserved unaltered.
                 ie = old_ie.copy()
                 # Note: specific file commits after a merge are currently
                 # prohibited. This test is for sanity/safety in case it's
@@ -662,7 +666,7 @@
                 if len(self.parents) > 1:
                     ie.revision = None
                 if self.builder.record_entry_contents(ie, self.parent_invs, path,
-                    self.basis_tree):
+                    self.basis_tree, None):
                     self.any_entries_changed = True
 
         # note that deletes have occurred
@@ -686,6 +690,7 @@
         deleted_paths = set()
         work_inv = self.work_tree.inventory
         assert work_inv.root is not None
+        # XXX: Note that entries may have the wrong kind.
         entries = work_inv.iter_entries_by_dir(
             specific_file_ids=self.specific_file_ids, yield_parents=True)
         if not self.builder.record_root_entry:
@@ -704,18 +709,32 @@
             # deleted files matching that filter.
             if is_inside_any(deleted_paths, path):
                 continue
-            if not self.work_tree.has_filename(path):
-                deleted_paths.add(path)
-                self.reporter.missing(path)
-                deleted_ids.append(file_id)
-                continue
-            try:
-                kind = self.work_tree.kind(file_id)
-                # TODO: specific_files filtering before nested tree processing
-                if kind == 'tree-reference' and self.recursive == 'down':
-                    self._commit_nested_tree(file_id, path)
-            except errors.NoSuchFile:
-                pass
+            content_summary = self.work_tree.path_content_summary(path)
+            if not specific_files or is_inside_any(specific_files, path):
+                if content_summary[0] == 'missing':
+                    deleted_paths.add(path)
+                    self.reporter.missing(path)
+                    deleted_ids.append(file_id)
+                    continue
+            # TODO: have the builder do the nested commit just-in-time IF and
+            # only if needed.
+            if content_summary[0] == 'tree-reference':
+                # enforce repository nested tree policy.
+                if (not self.work_tree.supports_tree_reference() or
+                    # repository does not support it either.
+                    not self.branch.repository._format.supports_tree_reference):
+                    content_summary = ('directory',) + content_summary[1:]
+            kind = content_summary[0]
+            # TODO: specific_files filtering before nested tree processing
+            if kind == 'tree-reference':
+                if self.recursive == 'down':
+                    nested_revision_id = self._commit_nested_tree(
+                        file_id, path)
+                    content_summary = content_summary[:3] + (
+                        nested_revision_id,)
+                else:
+                    content_summary = content_summary[:3] + (
+                        self.work_tree.get_reference_revision(file_id),)
 
             # Record an entry for this item
             # Note: I don't particularly want to have the existing_ie
@@ -723,7 +742,8 @@
             # without it thanks to a unicode normalisation issue. :-(
             definitely_changed = kind != existing_ie.kind
             self._record_entry(path, file_id, specific_files, kind, name,
-                parent_id, definitely_changed, existing_ie, report_changes)
+                parent_id, definitely_changed, existing_ie, report_changes,
+                content_summary)
 
         # Unversion IDs that were found to be deleted
         self.work_tree.unversion(deleted_ids)
@@ -742,7 +762,7 @@
             sub_tree.branch.repository = \
                 self.work_tree.branch.repository
         try:
-            sub_tree.commit(message=None, revprops=self.revprops,
+            return sub_tree.commit(message=None, revprops=self.revprops,
                 recursive=self.recursive,
                 message_callback=self.message_callback,
                 timestamp=self.timestamp, timezone=self.timezone,
@@ -751,11 +771,11 @@
                 strict=self.strict, verbose=self.verbose,
                 local=self.local, reporter=self.reporter)
         except errors.PointlessCommit:
-            pass
+            return self.work_tree.get_reference_revision(file_id)
 
     def _record_entry(self, path, file_id, specific_files, kind, name,
-            parent_id, definitely_changed, existing_ie=None,
-            report_changes=True):
+        parent_id, definitely_changed, existing_ie, report_changes,
+        content_summary):
         "Record the new inventory entry for a path if any."
         # mutter('check %s {%s}', path, file_id)
         # mutter('%s selected for commit', path)
@@ -764,8 +784,8 @@
         else:
             ie = existing_ie.copy()
             ie.revision = None
-        if self.builder.record_entry_contents(ie, self.parent_invs, 
-            path, self.work_tree):
+        if self.builder.record_entry_contents(ie, self.parent_invs,
+            path, self.work_tree, content_summary):
             self.any_entries_changed = True
         if report_changes:
             self._report_change(ie, path)

=== modified file 'bzrlib/inventory.py'
--- a/bzrlib/inventory.py	2007-10-03 02:10:09 +0000
+++ b/bzrlib/inventory.py	2007-10-03 05:04:42 +0000
@@ -432,32 +432,6 @@
                    self.parent_id,
                    self.revision))
 
-    def snapshot(self, revision, path, previous_entries,
-                 work_tree, commit_builder):
-        """Make a snapshot of this entry which may or may not have changed.
-        
-        This means that all its fields are populated, that it has its
-        text stored in the text store or weave.
-
-        :return: True if anything was recorded
-        """
-        # cannot be unchanged unless there is only one parent file rev.
-        self._read_tree_state(path, work_tree)
-        if len(previous_entries) == 1:
-            parent_ie = previous_entries.values()[0]
-            if self._unchanged(parent_ie):
-                self.revision = parent_ie.revision
-                return False
-        self.revision = revision
-        return self._snapshot_text(previous_entries, work_tree, commit_builder)
-
-    def _snapshot_text(self, file_parents, work_tree, commit_builder): 
-        """Record the 'text' of this entry, whatever form that takes.
-
-        :return: True if anything was recorded
-        """
-        raise NotImplementedError(self._snapshot_text)
-
     def __eq__(self, other):
         if not isinstance(other, InventoryEntry):
             return NotImplemented
@@ -582,11 +556,6 @@
         """See InventoryEntry._put_on_disk."""
         os.mkdir(fullpath)
 
-    def _snapshot_text(self, file_parents, work_tree, commit_builder):
-        """See InventoryEntry._snapshot_text."""
-        commit_builder.modified_directory(self.file_id, file_parents)
-        return True
-
 
 class InventoryFile(InventoryEntry):
     """A file in an inventory."""
@@ -716,35 +685,6 @@
     def _forget_tree_state(self):
         self.text_sha1 = None
 
-    def snapshot(self, revision, path, previous_entries,
-                 work_tree, commit_builder):
-        """See InventoryEntry.snapshot."""
-        # Note: We use a custom implementation of this method for files
-        # because it's a performance critical part of commit.
-
-        # If this is the initial commit for this file, we know the sha is
-        # coming later so skip calculating it now (in _read_tree_state())
-        if len(previous_entries) == 0:
-            self.executable = work_tree.is_executable(self.file_id, path=path)
-        else:
-            self._read_tree_state(path, work_tree)
-
-        # If nothing is changed from the sole parent, there's nothing to do
-        if len(previous_entries) == 1:
-            parent_ie = previous_entries.values()[0]
-            if self._unchanged(parent_ie):
-                self.revision = parent_ie.revision
-                return False
-
-        # Add the file to the repository
-        self.revision = revision
-        def get_content_byte_lines():
-            return work_tree.get_file(self.file_id, path).readlines()
-        self.text_sha1, self.text_size = commit_builder.modified_file_text(
-            self.file_id, previous_entries, get_content_byte_lines,
-            self.text_sha1, self.text_size)
-        return True
-
     def _unchanged(self, previous_ie):
         """See InventoryEntry._unchanged."""
         compatible = super(InventoryFile, self)._unchanged(previous_ie)
@@ -845,12 +785,6 @@
             compatible = False
         return compatible
 
-    def _snapshot_text(self, file_parents, work_tree, commit_builder):
-        """See InventoryEntry._snapshot_text."""
-        commit_builder.modified_link(
-            self.file_id, file_parents, self.symlink_target)
-        return True
-
 
 class TreeReference(InventoryEntry):
     
@@ -866,10 +800,6 @@
         return TreeReference(self.file_id, self.name, self.parent_id,
                              self.revision, self.reference_revision)
 
-    def _snapshot_text(self, file_parents, work_tree, commit_builder):
-        commit_builder.modified_reference(self.file_id, file_parents)
-        return True
-
     def _read_tree_state(self, path, work_tree):
         """Populate fields in the inventory entry from the given tree.
         """

=== modified file 'bzrlib/memorytree.py'
--- a/bzrlib/memorytree.py	2007-08-23 00:12:35 +0000
+++ b/bzrlib/memorytree.py	2007-09-05 05:51:34 +0000
@@ -98,6 +98,26 @@
             return None, False, None
         return entry.kind, entry.executable, None
 
+    def path_content_summary(self, path):
+        """See Tree.path_content_summary."""
+        id = self.path2id(path)
+        if id is None:
+            return 'missing', None, None, None
+        kind = self.kind(id)
+        if kind == 'file':
+            bytes = self._file_transport.get_bytes(path)
+            size = len(bytes)
+            executable = self._inventory[id].executable
+            sha1 = None # no stat cache
+            return (kind, size, executable, sha1)
+        elif kind == 'directory':
+            # memory tree does not support nested trees yet.
+            return kind, None, None, None
+        elif kind == 'symlink':
+            raise NotImplementedError('symlink support')
+        else:
+            raise NotImplementedError('unknown kind')
+
     def _file_size(self, entry, stat_value):
         """See Tree._file_size."""
         if entry is None:

=== modified file 'bzrlib/repofmt/weaverepo.py'
--- a/bzrlib/repofmt/weaverepo.py	2007-09-23 22:22:17 +0000
+++ b/bzrlib/repofmt/weaverepo.py	2007-09-27 21:11:38 +0000
@@ -599,11 +599,12 @@
 class WeaveCommitBuilder(CommitBuilder):
     """A builder for weave based repos that don't support ghosts."""
 
-    def _add_text_to_weave(self, file_id, new_lines, parents):
+    def _add_text_to_weave(self, file_id, new_lines, parents, nostore_sha):
         versionedfile = self.repository.weave_store.get_weave_or_empty(
             file_id, self.repository.get_transaction())
         result = versionedfile.add_lines(
-            self._new_revision_id, parents, new_lines)[0:2]
+            self._new_revision_id, parents, new_lines,
+            nostore_sha=nostore_sha)[0:2]
         versionedfile.clear_cache()
         return result
 

=== modified file 'bzrlib/repository.py'
--- a/bzrlib/repository.py	2007-09-25 03:36:29 +0000
+++ b/bzrlib/repository.py	2007-10-03 04:23:01 +0000
@@ -200,14 +200,15 @@
                 ' record_entry_contents, as of bzr 0.10.',
                  DeprecationWarning, stacklevel=2)
             self.record_entry_contents(tree.inventory.root.copy(), parent_invs,
-                                       '', tree)
+                                       '', tree, tree.path_content_summary(''))
         else:
             # In this revision format, root entries have no knit or weave When
             # serializing out to disk and back in root.revision is always
             # _new_revision_id
             ie.revision = self._new_revision_id
 
-    def record_entry_contents(self, ie, parent_invs, path, tree):
+    def record_entry_contents(self, ie, parent_invs, path, tree,
+        content_summary):
         """Record the content of ie from tree into the commit if needed.
 
         Side effect: sets ie.revision when unchanged
@@ -217,90 +218,156 @@
             commit.
         :param path: The path the entry is at in the tree.
         :param tree: The tree which contains this entry and should be used to 
-        obtain content.
+            obtain content.
+        :param content_summary: Summary data from the tree about the paths
+            content - stat, length, exec, sha/link target. This is only
+            accessed when the entry has a revision of None - that is when it is
+            a candidate to commit.
         :return: True if a new version of the entry has been recorded.
             (Committing a merge where a file was only changed on the other side
             will not return True.)
         """
         if self.new_inventory.root is None:
             self._check_root(ie, parent_invs, tree)
+        if ie.revision is None:
+            kind = content_summary[0]
+        else:
+            # ie is carried over from a prior commit
+            kind = ie.kind
+        # XXX: repository specific check for nested tree support goes here - if
+        # the repo doesn't want nested trees we skip it ?
+        if (kind == 'tree-reference' and
+            not self.repository._format.supports_tree_reference):
+            # mismatch between commit builder logic and repository:
+            # this needs the entry creation pushed down into the builder.
+            raise NotImplementedError('Missing repository subtree support.')
+        # transitional assert only, will remove before release.
+        assert ie.kind == kind
         self.new_inventory.add(ie)
 
         # ie.revision is always None if the InventoryEntry is considered
-        # for committing. ie.snapshot will record the correct revision 
-        # which may be the sole parent if it is untouched.
+        # for committing. We may record the previous parents revision if the
+        # content is actually unchanged against a sole head.
         if ie.revision is not None:
             return ie.revision == self._new_revision_id and (path != '' or
                 self._versioned_root)
 
+        # XXX: Friction: parent_candidates should return a list not a dict
+        #      so that we don't have to walk the inventories again.
         parent_candiate_entries = ie.parent_candidates(parent_invs)
-        heads = self.repository.get_graph().heads(parent_candiate_entries.keys())
-        # XXX: Note that this is unordered - and this is tolerable because 
-        # the previous code was also unordered.
-        previous_entries = dict((head, parent_candiate_entries[head]) for head
-            in heads)
-        # we are creating a new revision for ie in the history store and
-        # inventory.
-        ie.snapshot(self._new_revision_id, path, previous_entries, tree, self)
-        return ie.revision == self._new_revision_id and (path != '' or
-            self._versioned_root)
-
-    def modified_directory(self, file_id, file_parents):
-        """Record the presence of a symbolic link.
-
-        :param file_id: The file_id of the link to record.
-        :param file_parents: The per-file parent revision ids.
-        """
-        self._add_text_to_weave(file_id, [], file_parents.keys())
-
-    def modified_reference(self, file_id, file_parents):
-        """Record the modification of a reference.
-
-        :param file_id: The file_id of the link to record.
-        :param file_parents: The per-file parent revision ids.
-        """
-        self._add_text_to_weave(file_id, [], file_parents.keys())
-    
-    def modified_file_text(self, file_id, file_parents,
-                           get_content_byte_lines, text_sha1=None,
-                           text_size=None):
-        """Record the text of file file_id
-
-        :param file_id: The file_id of the file to record the text of.
-        :param file_parents: The per-file parent revision ids.
-        :param get_content_byte_lines: A callable which will return the byte
-            lines for the file.
-        :param text_sha1: Optional SHA1 of the file contents.
-        :param text_size: Optional size of the file contents.
-        """
-        # mutter('storing text of file {%s} in revision {%s} into %r',
-        #        file_id, self._new_revision_id, self.repository.weave_store)
-        # special case to avoid diffing on renames or 
-        # reparenting
-        if (len(file_parents) == 1
-            and text_sha1 == file_parents.values()[0].text_sha1
-            and text_size == file_parents.values()[0].text_size):
-            previous_ie = file_parents.values()[0]
-            versionedfile = self.repository.weave_store.get_weave(file_id,
-                self.repository.get_transaction())
-            versionedfile.clone_text(self._new_revision_id,
-                previous_ie.revision, file_parents.keys())
-            return text_sha1, text_size
+        head_set = self.repository.get_graph().heads(parent_candiate_entries.keys())
+        heads = []
+        for inv in parent_invs:
+            if ie.file_id in inv:
+                old_rev = inv[ie.file_id].revision
+                if old_rev in head_set:
+                    heads.append(inv[ie.file_id].revision)
+                    head_set.remove(inv[ie.file_id].revision)
+
+        store = False
+        # now we check to see if we need to write a new record to the
+        # file-graph.
+        # We write a new entry unless there is one head to the ancestors, and
+        # the kind-derived content is unchanged.
+
+        # Cheapest check first: no ancestors, or more the one head in the
+        # ancestors, we write a new node.
+        if len(heads) != 1:
+            store = True
+        if not store:
+            # There is a single head, look it up for comparison
+            parent_entry = parent_candiate_entries[heads[0]]
+            # if the non-content specific data has changed, we'll be writing a
+            # node:
+            if (parent_entry.parent_id != ie.parent_id or
+                parent_entry.name != ie.name):
+                store = True
+        # now we need to do content specific checks:
+        if not store:
+            # if the kind changed the content obviously has
+            if kind != parent_entry.kind:
+                store = True
+        if kind == 'file':
+            if not store:
+                if (# if the file length changed we have to store:
+                    parent_entry.text_size != content_summary[1] or
+                    # if the exec bit has changed we have to store:
+                    parent_entry.executable != content_summary[2]):
+                    store = True
+                elif parent_entry.text_sha1 == content_summary[3]:
+                    # all meta and content is unchanged (using a hash cache
+                    # hit to check the sha)
+                    ie.revision = parent_entry.revision
+                    ie.text_size = parent_entry.text_size
+                    ie.text_sha1 = parent_entry.text_sha1
+                    ie.executable = parent_entry.executable
+                    return
+                else:
+                    # Either there is only a hash change(no hash cache entry,
+                    # or same size content change), or there is no change on
+                    # this file at all.
+                    # Provide the parent's hash to the store layer, so that the
+                    # content is unchanged we will not store a new node.
+                    nostore_sha = parent_entry.text_sha1
+            if store:
+                # We want to record a new node regardless of the presence or
+                # absence of a content change in the file.
+                nostore_sha = None
+            ie.executable = content_summary[2]
+            lines = tree.get_file(ie.file_id, path).readlines()
+            try:
+                ie.text_sha1, ie.text_size = self._add_text_to_weave(
+                    ie.file_id, lines, heads, nostore_sha)
+            except errors.ExistingContent:
+                # Turns out that the file content was unchanged, and we were
+                # only going to store a new node if it was changed. Carry over
+                # the entry.
+                ie.revision = parent_entry.revision
+                ie.text_size = parent_entry.text_size
+                ie.text_sha1 = parent_entry.text_sha1
+                ie.executable = parent_entry.executable
+                return
+        elif kind == 'directory':
+            if not store:
+                # all data is meta here, nothing specific to directory, so
+                # carry over:
+                ie.revision = parent_entry.revision
+                return
+            lines = []
+            self._add_text_to_weave(ie.file_id, lines, heads, None)
+        elif kind == 'symlink':
+            current_link_target = content_summary[3]
+            if not store:
+                # symlink target is not generic metadata, check if it has
+                # changed.
+                if current_link_target != parent_entry.symlink_target:
+                    store = True
+            if not store:
+                # unchanged, carry over.
+                ie.revision = parent_entry.revision
+                ie.symlink_target = parent_entry.symlink_target
+                return
+            ie.symlink_target = current_link_target
+            lines = []
+            self._add_text_to_weave(ie.file_id, lines, heads, None)
+        elif kind == 'tree-reference':
+            if not store:
+                if content_summary[3] != parent_entry.reference_revision:
+                    store = True
+            if not store:
+                # unchanged, carry over.
+                ie.reference_revision = parent_entry.reference_revision
+                ie.revision = parent_entry.revision
+                return
+            ie.reference_revision = content_summary[3]
+            lines = []
+            self._add_text_to_weave(ie.file_id, lines, heads, None)
         else:
-            new_lines = get_content_byte_lines()
-            return self._add_text_to_weave(file_id, new_lines,
-                file_parents.keys())
-
-    def modified_link(self, file_id, file_parents, link_target):
-        """Record the presence of a symbolic link.
-
-        :param file_id: The file_id of the link to record.
-        :param file_parents: The per-file parent revision ids.
-        :param link_target: Target location of this link.
-        """
-        self._add_text_to_weave(file_id, [], file_parents.keys())
-
-    def _add_text_to_weave(self, file_id, new_lines, parents):
+            raise NotImplementedError('unknown kind')
+        ie.revision = self._new_revision_id
+        return True
+
+    def _add_text_to_weave(self, file_id, new_lines, parents, nostore_sha):
         versionedfile = self.repository.weave_store.get_weave_or_empty(
             file_id, self.repository.get_transaction())
         # Don't change this to add_lines - add_lines_with_ghosts is cheaper
@@ -311,11 +378,13 @@
         # implementation could give us bad output from readlines() so this is
         # not a guarantee of safety. What would be better is always checking
         # the content during test suite execution. RBC 20070912
-        result = versionedfile.add_lines_with_ghosts(
-            self._new_revision_id, parents, new_lines,
-            random_id=self.random_revid, check_content=False)[0:2]
-        versionedfile.clear_cache()
-        return result
+        try:
+            return versionedfile.add_lines_with_ghosts(
+                self._new_revision_id, parents, new_lines,
+                nostore_sha=nostore_sha, random_id=self.random_revid,
+                check_content=False)[0:2]
+        finally:
+            versionedfile.clear_cache()
 
 
 class RootCommitBuilder(CommitBuilder):

=== modified file 'bzrlib/tests/interrepository_implementations/test_interrepository.py'
--- a/bzrlib/tests/interrepository_implementations/test_interrepository.py	2007-09-14 02:19:41 +0000
+++ b/bzrlib/tests/interrepository_implementations/test_interrepository.py	2007-09-27 21:11:38 +0000
@@ -305,7 +305,8 @@
             committer="Foo Bar <foo at example.com>",
             revision_id='ghost')
         ie = bzrlib.inventory.InventoryDirectory('TREE_ROOT', '', None)
-        builder.record_entry_contents(ie, [], '', None)
+        builder.record_entry_contents(ie, [], '', None,
+            ('directory', None, None, None))
         builder.finish_inventory()
         builder.commit("Message")
         repo.unlock()

=== modified file 'bzrlib/tests/repository_implementations/test_commit_builder.py'
--- a/bzrlib/tests/repository_implementations/test_commit_builder.py	2007-09-23 20:25:33 +0000
+++ b/bzrlib/tests/repository_implementations/test_commit_builder.py	2007-09-27 21:11:38 +0000
@@ -51,7 +51,8 @@
                 tree.unlock()
             parent_tree = tree.branch.repository.revision_tree(None)
             parent_invs = []
-            builder.record_entry_contents(ie, parent_invs, '', tree)
+            builder.record_entry_contents(ie, parent_invs, '', tree,
+                tree.path_content_summary(''))
 
     def test_finish_inventory(self):
         tree = self.make_branch_and_tree(".")
@@ -133,7 +134,8 @@
             builder = tree.branch.get_commit_builder([])
             self.callDeprecated(['Root entry should be supplied to'
                 ' record_entry_contents, as of bzr 0.10.'],
-                builder.record_entry_contents, entry, [], 'foo', tree)
+                builder.record_entry_contents, entry, [], 'foo', tree,
+                    tree.path_content_summary('foo'))
             builder.finish_inventory()
             rev_id = builder.commit('foo bar')
         finally:
@@ -151,7 +153,8 @@
             ie = inventory.make_entry('directory', '', None,
                     tree.inventory.root.file_id)
             self.assertFalse(builder.record_entry_contents(
-                ie, [parent_tree.inventory], '', tree))
+                ie, [parent_tree.inventory], '', tree,
+                tree.path_content_summary('')))
             builder.abort()
         except:
             builder.abort()
@@ -332,13 +335,15 @@
             # root
             builder.record_entry_contents(
                 inventory.make_entry('directory', '', None,
-                    tree.inventory.root.file_id), parent_invs, '', tree)
+                    tree.inventory.root.file_id), parent_invs, '', tree,
+                    tree.path_content_summary(''))
             def commit_id(file_id):
                 old_ie = tree.inventory[file_id]
                 path = tree.id2path(file_id)
                 ie = inventory.make_entry(tree.kind(file_id), old_ie.name,
                     old_ie.parent_id, file_id)
-                return builder.record_entry_contents(ie, parent_invs, path, tree)
+                return builder.record_entry_contents(ie, parent_invs, path,
+                    tree, tree.path_content_summary(path))
 
             file_id = name + 'id'
             parent_id = tree.inventory[file_id].parent_id

=== modified file 'bzrlib/tests/workingtree_implementations/test_inv.py'
--- a/bzrlib/tests/workingtree_implementations/test_inv.py	2007-10-03 02:10:09 +0000
+++ b/bzrlib/tests/workingtree_implementations/test_inv.py	2007-10-03 05:04:42 +0000
@@ -43,93 +43,6 @@
         self.assertEqual(len(wt.inventory), 1)
 
 
-class TestSnapshot(TestCaseWithWorkingTree):
-
-    def setUp(self):
-        # for full testing we'll need a branch
-        # with a subdir to test parent changes.
-        # and a file, link and dir under that.
-        # but right now I only need one attribute
-        # to change, and then test merge patterns
-        # with fake parent entries.
-        super(TestSnapshot, self).setUp()
-        self.wt = self.make_branch_and_tree('.')
-        self.branch = self.wt.branch
-        self.build_tree(['subdir/', 'subdir/file'], line_endings='binary')
-        self.wt.add(['subdir', 'subdir/file'],
-                                       ['dirid', 'fileid'])
-        self.wt.commit('message_1', rev_id = '1')
-        self.tree_1 = self.branch.repository.revision_tree('1')
-        self.inv_1 = self.branch.repository.get_inventory('1')
-        self.file_1 = self.inv_1['fileid']
-        self.wt.lock_write()
-        self.addCleanup(self.wt.unlock)
-        self.file_active = self.wt.inventory['fileid']
-        self.builder = self.branch.get_commit_builder([], timestamp=time.time(), revision_id='2')
-
-    def test_snapshot_new_revision(self):
-        # This tests that a simple commit with no parents makes a new
-        # revision value in the inventory entry
-        self.file_active.snapshot('2', 'subdir/file', {}, self.wt, self.builder)
-        # expected outcome - file_1 has a revision id of '2', and we can get
-        # its text of 'file contents' out of the weave.
-        self.assertEqual(self.file_1.revision, '1')
-        self.assertEqual(self.file_active.revision, '2')
-        # this should be a separate test probably, but lets check it once..
-        lines = self.branch.repository.weave_store.get_weave(
-            'fileid', 
-            self.branch.repository.get_transaction()).get_lines('2')
-        self.assertEqual(lines, ['contents of subdir/file\n'])
-        self.wt.branch.repository.commit_write_group()
-
-    def test_snapshot_unchanged(self):
-        #This tests that a simple commit does not make a new entry for
-        # an unchanged inventory entry
-        self.file_active.snapshot('2', 'subdir/file', {'1':self.file_1},
-                                  self.wt, self.builder)
-        self.assertEqual(self.file_1.revision, '1')
-        self.assertEqual(self.file_active.revision, '1')
-        vf = self.branch.repository.weave_store.get_weave(
-            'fileid', 
-            self.branch.repository.get_transaction())
-        self.assertRaises(errors.RevisionNotPresent,
-                          vf.get_lines,
-                          '2')
-        self.wt.branch.repository.commit_write_group()
-
-    def test_snapshot_merge_identical_different_revid(self):
-        # This tests that a commit with two identical parents, one of which has
-        # a different revision id, results in a new revision id in the entry.
-        # 1->other, commit a merge of other against 1, results in 2.
-        other_ie = inventory.InventoryFile('fileid', 'newname', self.file_1.parent_id)
-        other_ie = inventory.InventoryFile('fileid', 'file', self.file_1.parent_id)
-        other_ie.revision = '1'
-        other_ie.text_sha1 = self.file_1.text_sha1
-        other_ie.text_size = self.file_1.text_size
-        self.assertEqual(self.file_1, other_ie)
-        other_ie.revision = 'other'
-        self.assertNotEqual(self.file_1, other_ie)
-        versionfile = self.branch.repository.weave_store.get_weave(
-            'fileid', self.branch.repository.get_transaction())
-        versionfile.clone_text('other', '1', ['1'])
-        self.file_active.snapshot('2', 'subdir/file', 
-                                  {'1':self.file_1, 'other':other_ie},
-                                  self.wt, self.builder)
-        self.assertEqual(self.file_active.revision, '2')
-        self.wt.branch.repository.commit_write_group()
-
-    def test_snapshot_changed(self):
-        # This tests that a commit with one different parent results in a new
-        # revision id in the entry.
-        self.wt.rename_one('subdir/file', 'subdir/newname')
-        self.file_active = self.wt.inventory['fileid']
-        self.file_active.snapshot('2', 'subdir/newname', {'1':self.file_1}, 
-                                  self.wt, self.builder)
-        # expected outcome - file_1 has a revision id of '2'
-        self.assertEqual(self.file_active.revision, '2')
-        self.wt.branch.repository.commit_write_group()
-
-
 class TestApplyInventoryDelta(TestCaseWithWorkingTree):
 
     def test_add(self):




More information about the bazaar-commits mailing list