Rev 3889: (robertc, jam) Add Repository.add_inventory_by_delta, in file:///home/pqm/archives/thelove/bzr/%2Btrunk/

Canonical.com Patch Queue Manager pqm at pqm.ubuntu.com
Wed Dec 10 05:12:55 GMT 2008


At file:///home/pqm/archives/thelove/bzr/%2Btrunk/

------------------------------------------------------------
revno: 3889
revision-id: pqm at pqm.ubuntu.com-20081210051250-2czm9b99a7e7y0xi
parent: pqm at pqm.ubuntu.com-20081210011933-axdrxiq306imj2ty
parent: john at arbash-meinel.com-20081210043421-2uaz4mfuzw3ca5jz
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Wed 2008-12-10 05:12:50 +0000
message:
  (robertc, jam) Add Repository.add_inventory_by_delta,
  	and use it in the InterDifferingSerializer code.
added:
  bzrlib/tests/per_repository/test_add_inventory_by_delta.py test_add_inventory_d-20081013002626-rut81igtlqb4590z-1
modified:
  NEWS                           NEWS-20050323055033-4e00b5db738777ff
  bzrlib/__init__.py             __init__.py-20050309040759-33e65acf91bbcd5d
  bzrlib/commit.py               commit.py-20050511101309-79ec1a0168e0e825
  bzrlib/inventory.py            inventory.py-20050309040759-6648b84ca2005b37
  bzrlib/remote.py               remote.py-20060720103555-yeeg2x51vn0rbtdp-1
  bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
  bzrlib/tests/interrepository_implementations/__init__.py __init__.py-20060220054744-baf49a1f88f17b1a
  bzrlib/tests/per_repository/__init__.py __init__.py-20060131092037-9564957a7d4a841b
  bzrlib/tests/per_repository/test_commit_builder.py test_commit_builder.py-20060606110838-76e3ra5slucqus81-1
  bzrlib/tests/per_repository/test_repository.py test_repository.py-20060131092128-ad07f494f5c9d26c
    ------------------------------------------------------------
    revno: 3879.2.13
    revision-id: john at arbash-meinel.com-20081210043421-2uaz4mfuzw3ca5jz
    parent: john at arbash-meinel.com-20081210032133-2mxcpa2p81kbpi5c
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: differ_serializer
    timestamp: Tue 2008-12-09 22:34:21 -0600
    message:
      There was a test that asserted we called pb.update() with the last revision.
    modified:
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
    ------------------------------------------------------------
    revno: 3879.2.12
    revision-id: john at arbash-meinel.com-20081210032133-2mxcpa2p81kbpi5c
    parent: john at arbash-meinel.com-20081210001000-xsgsn2kt5ce6dfl2
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: differ_serializer
    timestamp: Tue 2008-12-09 21:21:33 -0600
    message:
      Missed an add_inventory_delta => add_inventory_by_delta.
    modified:
      bzrlib/tests/per_repository/test_repository.py test_repository.py-20060131092128-ad07f494f5c9d26c
    ------------------------------------------------------------
    revno: 3879.2.11
    revision-id: john at arbash-meinel.com-20081210001000-xsgsn2kt5ce6dfl2
    parent: john at arbash-meinel.com-20081207173303-ydamo2rxs3ngjhw0
    parent: pqm at pqm.ubuntu.com-20081209163533-fj6hx9l65sretbai
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: differ_serializer
    timestamp: Tue 2008-12-09 18:10:00 -0600
    message:
      Merge bzr.dev, resolve NEWS
    modified:
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
      bzrlib/builtins.py             builtins.py-20050830033751-fc01482b9ca23183
      bzrlib/bzrdir.py               bzrdir.py-20060131065624-156dfea39c4387cb
      bzrlib/fetch.py                fetch.py-20050818234941-26fea6105696365d
      bzrlib/knit.py                 knit.py-20051212171256-f056ac8f0fbe1bd9
      bzrlib/registry.py             lazy_factory.py-20060809213415-2gfvqadtvdn0phtg-1
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
      bzrlib/tests/blackbox/test_pull.py test_pull.py-20051201144907-64959364f629947f
      bzrlib/tests/bzrdir_implementations/test_bzrdir.py test_bzrdir.py-20060131065642-0ebeca5e30e30866
      bzrlib/tests/per_repository/test_add_fallback_repository.py test_add_fallback_re-20080215040003-8w9n4ck9uqdxj18m-1
      bzrlib/tests/test_fetch.py     testfetch.py-20050825090644-f73e07e7dfb1765a
      bzrlib/tests/test_versionedfile.py test_versionedfile.py-20060222045249-db45c9ed14a1c2e5
      bzrlib/transport/__init__.py   transport.py-20050711165921-4978aa7ce1285ad5
      bzrlib/upgrade.py              history2weaves.py-20050818063535-e7d319791c19a8b2
      bzrlib/versionedfile.py        versionedfile.py-20060222045106-5039c71ee3b65490
    ------------------------------------------------------------
    revno: 3879.2.10
    revision-id: john at arbash-meinel.com-20081207173303-ydamo2rxs3ngjhw0
    parent: john at arbash-meinel.com-20081205222549-lri0j1a3wv37wtax
    parent: john at arbash-meinel.com-20081207172622-r3hrmb872nwmezeu
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: differ_serializer
    timestamp: Sun 2008-12-07 11:33:03 -0600
    message:
      Merge in the new add_inventory_by_delta and handle the new return values.
    modified:
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
      bzrlib/tests/per_repository/test_add_inventory_by_delta.py test_add_inventory_d-20081013002626-rut81igtlqb4590z-1
        ------------------------------------------------------------
        revno: 3879.3.1
        revision-id: john at arbash-meinel.com-20081207172622-r3hrmb872nwmezeu
        parent: john at arbash-meinel.com-20081205172501-a0g7ho4sl29q6dz9
        committer: John Arbash Meinel <john at arbash-meinel.com>
        branch nick: add_inventory_by_delta
        timestamp: Sun 2008-12-07 11:26:22 -0600
        message:
          Change the return of add_inventory_by_delta to also return the Inventory.
        modified:
          bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
    ------------------------------------------------------------
    revno: 3879.2.9
    revision-id: john at arbash-meinel.com-20081205222549-lri0j1a3wv37wtax
    parent: john at arbash-meinel.com-20081205221928-kzstz04ngqrxpb12
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: differ_serializer
    timestamp: Fri 2008-12-05 16:25:49 -0600
    message:
      Use a last-modified-revision test.
      
      This avoids copying the same text revisions multiple times.
    modified:
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
    ------------------------------------------------------------
    revno: 3879.2.8
    revision-id: john at arbash-meinel.com-20081205221928-kzstz04ngqrxpb12
    parent: john at arbash-meinel.com-20081205221847-hs9mh3yuinxq7w29
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: differ_serializer
    timestamp: Fri 2008-12-05 16:19:28 -0600
    message:
      Bring in the CHK inter-differing-serializer fetch code.
      
      Refactor it into several helper functions which makes the flow a bit
      clearer and reduces the indentation.
    modified:
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
    ------------------------------------------------------------
    revno: 3879.2.7
    revision-id: john at arbash-meinel.com-20081205221847-hs9mh3yuinxq7w29
    parent: john at arbash-meinel.com-20081205221809-d3c4cz1jyv9p1y1h
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: differ_serializer
    timestamp: Fri 2008-12-05 16:18:47 -0600
    message:
      Add an Inter test that actually uses InterDifferingSerializer
    modified:
      bzrlib/tests/interrepository_implementations/__init__.py __init__.py-20060220054744-baf49a1f88f17b1a
    ------------------------------------------------------------
    revno: 3879.2.6
    revision-id: john at arbash-meinel.com-20081205221809-d3c4cz1jyv9p1y1h
    parent: john at arbash-meinel.com-20081205172501-a0g7ho4sl29q6dz9
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: differ_serializer
    timestamp: Fri 2008-12-05 16:18:09 -0600
    message:
      Bring in Inventory._make_delta
    modified:
      bzrlib/inventory.py            inventory.py-20050309040759-6648b84ca2005b37
    ------------------------------------------------------------
    revno: 3879.2.5
    revision-id: john at arbash-meinel.com-20081205172501-a0g7ho4sl29q6dz9
    parent: john at arbash-meinel.com-20081205164010-05sx88jxi50q819a
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: add_inventory_by_delta
    timestamp: Fri 2008-12-05 11:25:01 -0600
    message:
      Change record_delete() to return the delta.
      
      Add direct tests for CB.get_basis_delta(), to ensure that it returns a
      valid delta, and that it errors if the client hasn't called will_record_deletes.
    modified:
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
      bzrlib/tests/per_repository/test_commit_builder.py test_commit_builder.py-20060606110838-76e3ra5slucqus81-1
    ------------------------------------------------------------
    revno: 3879.2.4
    revision-id: john at arbash-meinel.com-20081205164010-05sx88jxi50q819a
    parent: john at arbash-meinel.com-20081205162905-12c9k3esfetyes4a
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: add_inventory_by_delta
    timestamp: Fri 2008-12-05 10:40:10 -0600
    message:
      This didn't land in 1.9 or 1.10, so make the minimum version 1.11
    modified:
      bzrlib/__init__.py             __init__.py-20050309040759-33e65acf91bbcd5d
    ------------------------------------------------------------
    revno: 3879.2.3
    revision-id: john at arbash-meinel.com-20081205162905-12c9k3esfetyes4a
    parent: john at arbash-meinel.com-20081205160704-ti2a80z9tvqehwws
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: differ_serializer
    timestamp: Fri 2008-12-05 10:29:05 -0600
    message:
      Hide the .basis_delta variable, and require callers to use .get_basis_delta()
      This allows us to check that the callers were sure they would be
      generating a proper delta, by using CommitBuilder.record_delete() correctly.
    modified:
      bzrlib/commit.py               commit.py-20050511101309-79ec1a0168e0e825
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
    ------------------------------------------------------------
    revno: 3879.2.2
    revision-id: john at arbash-meinel.com-20081205160704-ti2a80z9tvqehwws
    parent: john at arbash-meinel.com-20081205160233-f4c61by1u3kf5sqj
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: differ_serializer
    timestamp: Fri 2008-12-05 10:07:04 -0600
    message:
      Rename add_inventory_delta to add_inventory_by_delta.
    renamed:
      bzrlib/tests/per_repository/test_add_inventory_delta.py => bzrlib/tests/per_repository/test_add_inventory_by_delta.py test_add_inventory_d-20081013002626-rut81igtlqb4590z-1
    modified:
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
      bzrlib/remote.py               remote.py-20060720103555-yeeg2x51vn0rbtdp-1
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
      bzrlib/tests/per_repository/__init__.py __init__.py-20060131092037-9564957a7d4a841b
      bzrlib/tests/per_repository/test_add_inventory_by_delta.py test_add_inventory_d-20081013002626-rut81igtlqb4590z-1
    ------------------------------------------------------------
    revno: 3879.2.1
    revision-id: john at arbash-meinel.com-20081205160233-f4c61by1u3kf5sqj
    parent: pqm at pqm.ubuntu.com-20081205135154-uwqcpl3lruah9fo3
    parent: robertc at robertcollins.net-20081013044331-ibghe8j4no7uhb55
    committer: John Arbash Meinel <john at arbash-meinel.com>
    branch nick: differ_serializer
    timestamp: Fri 2008-12-05 10:02:33 -0600
    message:
      Merge in the add_inventory_delta code.
    added:
      bzrlib/tests/per_repository/test_add_inventory_delta.py test_add_inventory_d-20081013002626-rut81igtlqb4590z-1
    modified:
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
      bzrlib/__init__.py             __init__.py-20050309040759-33e65acf91bbcd5d
      bzrlib/commit.py               commit.py-20050511101309-79ec1a0168e0e825
      bzrlib/remote.py               remote.py-20060720103555-yeeg2x51vn0rbtdp-1
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
      bzrlib/tests/per_repository/__init__.py __init__.py-20060131092037-9564957a7d4a841b
      bzrlib/tests/per_repository/test_commit_builder.py test_commit_builder.py-20060606110838-76e3ra5slucqus81-1
      bzrlib/tests/per_repository/test_repository.py test_repository.py-20060131092128-ad07f494f5c9d26c
    ------------------------------------------------------------
    revno: 3775.2.3
    revision-id: robertc at robertcollins.net-20081013044331-ibghe8j4no7uhb55
    parent: robertc at robertcollins.net-20081013043229-dn4s7hfg6h6zcobm
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit-delta
    timestamp: Mon 2008-10-13 15:43:31 +1100
    message:
      Delegate basis inventory calculation during commit to the CommitBuilder object.
    modified:
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
      bzrlib/__init__.py             __init__.py-20050309040759-33e65acf91bbcd5d
      bzrlib/commit.py               commit.py-20050511101309-79ec1a0168e0e825
    ------------------------------------------------------------
    revno: 3775.2.2
    revision-id: robertc at robertcollins.net-20081013043229-dn4s7hfg6h6zcobm
    parent: robertc at robertcollins.net-20081013002817-xxxsr37afvuhbzdx
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit-delta
    timestamp: Mon 2008-10-13 15:32:29 +1100
    message:
      Teach CommitBuilder to accumulate inventory deltas.
    modified:
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
      bzrlib/tests/per_repository/test_commit_builder.py test_commit_builder.py-20060606110838-76e3ra5slucqus81-1
    ------------------------------------------------------------
    revno: 3775.2.1
    revision-id: robertc at robertcollins.net-20081013002817-xxxsr37afvuhbzdx
    parent: pqm at pqm.ubuntu.com-20081012204951-j2dgh06nuzrak1ri
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: commit-delta
    timestamp: Mon 2008-10-13 11:28:17 +1100
    message:
      Create bzrlib.repository.Repository.add_inventory_delta for adding inventories via deltas.
    added:
      bzrlib/tests/per_repository/test_add_inventory_delta.py test_add_inventory_d-20081013002626-rut81igtlqb4590z-1
    modified:
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
      bzrlib/remote.py               remote.py-20060720103555-yeeg2x51vn0rbtdp-1
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
      bzrlib/tests/per_repository/__init__.py __init__.py-20060131092037-9564957a7d4a841b
      bzrlib/tests/per_repository/test_repository.py test_repository.py-20060131092128-ad07f494f5c9d26c
=== modified file 'NEWS'
--- a/NEWS	2008-12-10 00:41:16 +0000
+++ b/NEWS	2008-12-10 05:12:50 +0000
@@ -48,6 +48,11 @@
 
   API CHANGES:
 
+    * The logic in commit now delegates inventory basis calculations to
+      the ``CommitBuilder`` object; this requires that the commit builder
+      in use has been updated to support the new ``will_record_deletes`` and
+      ``record_delete`` methods. (Robert Collins)
+
   TESTING:
 
   INTERNALS:
@@ -58,6 +63,15 @@
       returned from each pack in turn, in forward I/O order.
       (John Arbash Meinel)
 
+    * New method ``bzrlib.repository.Repository.add_inventory_by_delta``
+      allows adding an inventory via an inventory delta, which can be
+      more efficient for some repository types. (Robert Collins)
+
+    * Repository ``CommitBuilder`` objects can now accumulate an inventory
+      delta. To enable this functionality call ``builder.will_record_deletes``
+      and additionally call ``builder.record_delete`` when a delete
+      against the basis occurs. (Robert Collins)
+
     * The default http handler has been changed from pycurl to urllib.
       The default is still pycurl for https connections. (The only
       advantage of pycurl is that it checks ssl certificates.)
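
The NEWS entries above describe a small protocol on the commit builder: opt in to delete recording, report each delete, then read the accumulated delta. A minimal sketch of that bookkeeping follows, assuming a hypothetical ToyCommitBuilder class and a hypothetical record_change() helper that stands in for what record_entry_contents() appends for adds and changes. It is not the bzrlib implementation, only an illustration of the call order.

```python
class ToyCommitBuilder:
    """Hypothetical stand-in for CommitBuilder: delta bookkeeping only."""

    def __init__(self):
        self._basis_delta = []
        # Off by default, so older callers that never report deletes
        # cannot accidentally read an incomplete delta.
        self._recording_deletes = False

    def will_record_deletes(self):
        self._recording_deletes = True

    def record_change(self, old_path, new_path, file_id, entry):
        # Hypothetical helper: models the add/change tuples that
        # record_entry_contents() accumulates in the patch.
        self._basis_delta.append((old_path, new_path, file_id, entry))

    def record_delete(self, path, file_id):
        if not self._recording_deletes:
            raise AssertionError("recording deletes not activated.")
        delta = (path, None, file_id, None)
        self._basis_delta.append(delta)
        return delta

    def get_basis_delta(self):
        if not self._recording_deletes:
            raise AssertionError("recording deletes not activated.")
        return self._basis_delta


builder = ToyCommitBuilder()
builder.will_record_deletes()
builder.record_change(None, "README", "readme-id", {"kind": "file"})
builder.record_delete("old.txt", "old-id")
basis_delta = builder.get_basis_delta()
```

Each item is an (old_path, new_path, file_id, entry) tuple, the same shape the patch passes to update_basis_by_delta().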

=== modified file 'bzrlib/__init__.py'
--- a/bzrlib/__init__.py	2008-12-05 15:34:02 +0000
+++ b/bzrlib/__init__.py	2008-12-10 00:10:00 +0000
@@ -53,8 +53,8 @@
 version_info = (1, 11, 0, 'dev', 0)
 
 
-# API compatibility version: bzrlib is currently API compatible with 1.7.
-api_minimum_version = (1, 7, 0)
+# API compatibility version: bzrlib is currently API compatible with 1.11.
+api_minimum_version = (1, 11, 0)
 
 
 def _format_version_tuple(version_info):

=== modified file 'bzrlib/commit.py'
--- a/bzrlib/commit.py	2008-11-13 05:48:26 +0000
+++ b/bzrlib/commit.py	2008-12-05 16:29:05 +0000
@@ -285,9 +285,6 @@
         self.committer = committer
         self.strict = strict
         self.verbose = verbose
-        # accumulates an inventory delta to the basis entry, so we can make
-        # just the necessary updates to the workingtree's cached basis.
-        self._basis_delta = []
 
         self.work_tree.lock_write()
         self.pb = bzrlib.ui.ui_factory.nested_progress_bar()
@@ -355,8 +352,9 @@
                     entries_title="Directory")
             self.builder = self.branch.get_commit_builder(self.parents,
                 self.config, timestamp, timezone, committer, revprops, rev_id)
-            
+
             try:
+                self.builder.will_record_deletes()
                 # find the location being committed to
                 if self.bound_branch:
                     master_location = self.master_branch.base
@@ -414,7 +412,7 @@
             # Make the working tree up to date with the branch
             self._set_progress_stage("Updating the working tree")
             self.work_tree.update_basis_by_delta(self.rev_id,
-                 self._basis_delta)
+                 self.builder.get_basis_delta())
             self.reporter.completed(new_revno, self.rev_id)
             self._process_post_hooks(old_revno, new_revno)
         finally:
@@ -433,7 +431,7 @@
         # A merge with no effect on files
         if len(self.parents) > 1:
             return
-        # TODO: we could simplify this by using self._basis_delta.
+        # TODO: we could simplify this by using self.builder.get_basis_delta().
 
         # The initial commit adds a root directory, but this in itself is not
         # a worthwhile commit.
@@ -696,12 +694,10 @@
                 # required after that changes.
                 if len(self.parents) > 1:
                     ie.revision = None
-                delta, version_recorded, _ = self.builder.record_entry_contents(
+                _, version_recorded, _ = self.builder.record_entry_contents(
                     ie, self.parent_invs, path, self.basis_tree, None)
                 if version_recorded:
                     self.any_entries_changed = True
-                if delta:
-                    self._basis_delta.append(delta)
 
     def _report_and_accumulate_deletes(self):
         # XXX: Could the list of deleted paths and ids be instead taken from
@@ -726,7 +722,7 @@
             deleted.sort()
             # XXX: this is not quite directory-order sorting
             for path, file_id in deleted:
-                self._basis_delta.append((path, None, file_id, None))
+                self.builder.record_delete(path, file_id)
                 self.reporter.deleted(path)
 
     def _populate_from_inventory(self):
@@ -863,10 +859,8 @@
             ie.revision = None
         # For carried over entries we don't care about the fs hash - the repo
         # isn't generating a sha, so we're not saving computation time.
-        delta, version_recorded, fs_hash = self.builder.record_entry_contents(
+        _, version_recorded, fs_hash = self.builder.record_entry_contents(
             ie, self.parent_invs, path, self.work_tree, content_summary)
-        if delta:
-            self._basis_delta.append(delta)
         if version_recorded:
             self.any_entries_changed = True
         if report_changes:

=== modified file 'bzrlib/inventory.py'
--- a/bzrlib/inventory.py	2008-08-09 00:45:30 +0000
+++ b/bzrlib/inventory.py	2008-12-05 22:18:09 +0000
@@ -1233,9 +1233,27 @@
     def has_id(self, file_id):
         return (file_id in self._byid)
 
+    def _make_delta(self, old):
+        """Make an inventory delta from two inventories."""
+        old_ids = set(old)
+        new_ids = set(self)
+        adds = new_ids - old_ids
+        deletes = old_ids - new_ids
+        common = old_ids.intersection(new_ids)
+        delta = []
+        for file_id in deletes:
+            delta.append((old.id2path(file_id), None, file_id, None))
+        for file_id in adds:
+            delta.append((None, self.id2path(file_id), file_id, self[file_id]))
+        for file_id in common:
+            if old[file_id] != self[file_id]:
+                delta.append((old.id2path(file_id), self.id2path(file_id),
+                    file_id, self[file_id]))
+        return delta
+
     def remove_recursive_id(self, file_id):
         """Remove file_id, and children, from the inventory.
-        
+
         :param file_id: A file_id to remove.
         """
         to_find_delete = [self._byid[file_id]]
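
The _make_delta() addition above is a three-way set comparison of file ids: deletes, adds, then common ids whose entries differ. A rough sketch of the same idea on plain dictionaries follows, assuming a hypothetical representation where an inventory is a dict mapping file_id to a (path, entry) pair. The sorted() calls are an addition for deterministic output and are not in the patch.

```python
def make_delta(old, new):
    """Build an inventory-style delta turning `old` into `new`.

    Both arguments are dicts of file_id -> (path, entry); this layout is
    an assumption made for the sketch, not bzrlib's Inventory class.
    """
    old_ids = set(old)
    new_ids = set(new)
    delta = []
    # Ids only in the old inventory were deleted.
    for file_id in sorted(old_ids - new_ids):
        delta.append((old[file_id][0], None, file_id, None))
    # Ids only in the new inventory were added.
    for file_id in sorted(new_ids - old_ids):
        path, entry = new[file_id]
        delta.append((None, path, file_id, entry))
    # Ids in both are reported only if the path or entry changed.
    for file_id in sorted(old_ids & new_ids):
        if old[file_id] != new[file_id]:
            path, entry = new[file_id]
            delta.append((old[file_id][0], path, file_id, entry))
    return delta


old = {"a-id": ("a.txt", "v1"), "b-id": ("b.txt", "v1")}
new = {"a-id": ("a.txt", "v2"), "c-id": ("c.txt", "v1")}
delta = make_delta(old, new)
```

Here delta holds one delete (b-id), one add (c-id) and one change (a-id), in that order.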

=== modified file 'bzrlib/remote.py'
--- a/bzrlib/remote.py	2008-12-01 23:50:52 +0000
+++ b/bzrlib/remote.py	2008-12-05 16:07:04 +0000
@@ -766,6 +766,12 @@
         self._ensure_real()
         return self._real_repository.add_inventory(revid, inv, parents)
 
+    def add_inventory_by_delta(self, basis_revision_id, delta, new_revision_id,
+                               parents):
+        self._ensure_real()
+        return self._real_repository.add_inventory_by_delta(basis_revision_id,
+            delta, new_revision_id, parents)
+
     def add_revision(self, rev_id, rev, inv=None, config=None):
         self._ensure_real()
         return self._real_repository.add_revision(
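
For readers unfamiliar with the delta format that add_inventory_by_delta() accepts, here is a toy sketch of applying one to a basis inventory and storing the result under a new revision id. The dict-based store, the apply_delta() helper and the simplified signature (no parents, no validator in the return value) are assumptions for illustration. The method in this patch returns (validator, new_inv) and mutates the basis tree's inventory, whereas this sketch copies it.

```python
def apply_delta(inventory, delta):
    """Return a copy of `inventory` with `delta` applied.

    `inventory` is a dict of file_id -> (path, entry); `delta` is a list
    of (old_path, new_path, file_id, entry) tuples.
    """
    result = dict(inventory)
    for old_path, new_path, file_id, entry in delta:
        if old_path is not None:
            # Deleted or changed: the old record must exist in the basis.
            del result[file_id]
        if new_path is not None:
            # Added or changed: store the new record.
            result[file_id] = (new_path, entry)
    return result


def add_inventory_by_delta(store, basis_revision_id, delta, new_revision_id):
    """Toy version: look up the basis, apply the delta, store the result."""
    new_inv = apply_delta(store[basis_revision_id], delta)
    store[new_revision_id] = new_inv
    return new_inv


store = {"rev-1": {"a-id": ("a.txt", "v1"), "b-id": ("b.txt", "v1")}}
delta = [
    ("b.txt", None, "b-id", None),      # delete
    (None, "c.txt", "c-id", "v1"),      # add
    ("a.txt", "a.txt", "a-id", "v2"),   # change in place
]
new_inv = add_inventory_by_delta(store, "rev-1", delta, "rev-2")
```

Because only the delta crosses the API, a repository format that stores inventories as deltas could in principle avoid reserialising the whole inventory, which is the efficiency the NEWS entry alludes to.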

=== modified file 'bzrlib/repository.py'
--- a/bzrlib/repository.py	2008-12-05 15:34:02 +0000
+++ b/bzrlib/repository.py	2008-12-10 04:34:21 +0000
@@ -67,10 +67,10 @@
 class CommitBuilder(object):
     """Provides an interface to build up a commit.
 
-    This allows describing a tree to be committed without needing to 
+    This allows describing a tree to be committed without needing to
     know the internals of the format of the repository.
     """
-    
+
     # all clients should supply tree roots.
     record_root_entry = True
     # the default CommitBuilder does not manage trees whose root is versioned.
@@ -119,6 +119,12 @@
 
         self._generate_revision_if_needed()
         self.__heads = graph.HeadsCache(repository.get_graph()).heads
+        self._basis_delta = []
+        # API compatibility, older code that used CommitBuilder did not call
+        # .record_delete(), which means the delta that is computed would not be
+        # valid. Callers that will call record_delete() should call
+        # .will_record_deletes() to indicate that.
+        self._recording_deletes = False
 
     def _validate_unicode_text(self, text, context):
         """Verify things like commit messages don't have bogus characters."""
@@ -230,15 +236,60 @@
         """Get a delta against the basis inventory for ie."""
         if ie.file_id not in basis_inv:
             # add
-            return (None, path, ie.file_id, ie)
+            result = (None, path, ie.file_id, ie)
+            self._basis_delta.append(result)
+            return result
         elif ie != basis_inv[ie.file_id]:
             # common but altered
             # TODO: avoid tis id2path call.
-            return (basis_inv.id2path(ie.file_id), path, ie.file_id, ie)
+            result = (basis_inv.id2path(ie.file_id), path, ie.file_id, ie)
+            self._basis_delta.append(result)
+            return result
         else:
             # common, unaltered
             return None
 
+    def get_basis_delta(self):
+        """Return the complete inventory delta versus the basis inventory.
+
+        This has been built up with the calls to record_delete and
+        record_entry_contents. The client must have already called
+        will_record_deletes() to indicate that they will be generating a
+        complete delta.
+
+        :return: An inventory delta, suitable for use with apply_delta, or
+            Repository.add_inventory_by_delta, etc.
+        """
+        if not self._recording_deletes:
+            raise AssertionError("recording deletes not activated.")
+        return self._basis_delta
+
+    def record_delete(self, path, file_id):
+        """Record that a delete occurred against a basis tree.
+
+        This is an optional API - when used it adds items to the basis_delta
+        being accumulated by the commit builder. It cannot be called unless the
+        method will_record_deletes() has been called to inform the builder that
+        a delta is being supplied.
+
+        :param path: The path of the thing deleted.
+        :param file_id: The file id that was deleted.
+        """
+        if not self._recording_deletes:
+            raise AssertionError("recording deletes not activated.")
+        delta = (path, None, file_id, None)
+        self._basis_delta.append(delta)
+        return delta
+
+    def will_record_deletes(self):
+        """Tell the commit builder that deletes will be recorded.
+
+        This enables the accumulation of an inventory delta; for the resulting
+        commit to be valid, deletes against the basis MUST be recorded via
+        builder.record_delete().
+        """
+        self._recording_deletes = True
+
     def record_entry_contents(self, ie, parent_invs, path, tree,
         content_summary):
         """Record the content of ie from tree into the commit if needed.
@@ -296,15 +347,19 @@
         if ie.revision is not None:
             if not self._versioned_root and path == '':
                 # repositories that do not version the root set the root's
-                # revision to the new commit even when no change occurs, and
-                # this masks when a change may have occurred against the basis,
-                # so calculate if one happened.
+                # revision to the new commit even when no change occurs (more
+                # specifically, they do not record a revision on the root, and
+                # the rev id is assigned to the root during deserialisation);
+                # this masks when a change may have occurred against the basis.
+                # To match this we always issue a delta, because the revision
+                # of the root will always be changing.
                 if ie.file_id in basis_inv:
                     delta = (basis_inv.id2path(ie.file_id), path,
                         ie.file_id, ie)
                 else:
                     # add
                     delta = (None, path, ie.file_id, ie)
+                self._basis_delta.append(delta)
                 return delta, False, None
             else:
                 # we don't need to commit this, because the caller already
@@ -615,6 +670,45 @@
         return self._inventory_add_lines(revision_id, parents,
             inv_lines, check_content=False)
 
+    def add_inventory_by_delta(self, basis_revision_id, delta, new_revision_id,
+                               parents):
+        """Add a new inventory expressed as a delta against another revision.
+
+        :param basis_revision_id: The inventory id the delta was created
+            against. (This does not have to be a direct parent.)
+        :param delta: The inventory delta (see Inventory.apply_delta for
+            details).
+        :param new_revision_id: The revision id that the inventory is being
+            added for.
+        :param parents: The revision ids of the parents that new_revision_id
+            is known to have and are in the repository already. These are
+            supplied for repositories that depend on the inventory graph for
+            revision graph access, as well as for those that pun ancestry with
+            delta compression.
+
+        :returns: (validator, new_inv)
+            The validator (which is a sha1 digest, though what is sha'd is
+            repository format specific) of the serialized inventory, and the
+            resulting inventory.
+        """
+        if not self.is_in_write_group():
+            raise AssertionError("%r not in write group" % (self,))
+        _mod_revision.check_not_reserved_id(new_revision_id)
+        basis_tree = self.revision_tree(basis_revision_id)
+        basis_tree.lock_read()
+        try:
+            # Note that this mutates the inventory of basis_tree, which not all
+            # inventory implementations may support: A better idiom would be to
+            # return a new inventory, but as there is no revision tree cache in
+            # repository this is safe for now - RBC 20081013
+            basis_inv = basis_tree.inventory
+            basis_inv.apply_delta(delta)
+            basis_inv.revision_id = new_revision_id
+            return (self.add_inventory(new_revision_id, basis_inv, parents),
+                    basis_inv)
+        finally:
+            basis_tree.unlock()
+
     def _inventory_add_lines(self, revision_id, parents, lines,
         check_content=True):
         """Store lines in inv_vf and return the sha1 of the inventory."""
@@ -3082,40 +3176,132 @@
             return False
         return True
 
+    def _fetch_batch(self, revision_ids, basis_id, basis_tree):
+        """Fetch across a few revisions.
+
+        :param revision_ids: The revisions to copy
+        :param basis_id: The revision_id of basis_tree
+        :param basis_tree: A tree that is not in revision_ids which should
+            already exist in the target.
+        :return: (basis_id, basis_tree) A new basis to use now that these trees
+            have been copied.
+        """
+        # Walk through all revisions; get inventory deltas, copy the texts
+        # that each delta references, insert the delta, revision and
+        # signature.
+        text_keys = set()
+        pending_deltas = []
+        pending_revisions = []
+        for tree in self.source.revision_trees(revision_ids):
+            current_revision_id = tree.get_revision_id()
+            delta = tree.inventory._make_delta(basis_tree.inventory)
+            for old_path, new_path, file_id, entry in delta:
+                if new_path is not None:
+                    if not (new_path or self.target.supports_rich_root()):
+                        # We leave the inventory delta in, because that
+                        # will have the deserialised inventory root
+                        # pointer.
+                        continue
+                    # TODO: Do we need:
+                    #       "if entry.revision == current_revision_id" ?
+                    if entry.revision == current_revision_id:
+                        text_keys.add((file_id, entry.revision))
+            revision = self.source.get_revision(current_revision_id)
+            pending_deltas.append((basis_id, delta,
+                current_revision_id, revision.parent_ids))
+            pending_revisions.append(revision)
+            basis_id = current_revision_id
+            basis_tree = tree
+        # Copy file texts
+        from_texts = self.source.texts
+        to_texts = self.target.texts
+        to_texts.insert_record_stream(from_texts.get_record_stream(
+            text_keys, self.target._fetch_order,
+            not self.target._fetch_uses_deltas))
+        # insert deltas
+        for delta in pending_deltas:
+            self.target.add_inventory_by_delta(*delta)
+        # insert signatures and revisions
+        for revision in pending_revisions:
+            try:
+                signature = self.source.get_signature_text(
+                    revision.revision_id)
+                self.target.add_signature_text(revision.revision_id,
+                    signature)
+            except errors.NoSuchRevision:
+                pass
+            self.target.add_revision(revision.revision_id, revision)
+        return basis_id, basis_tree
+
+    def _fetch_all_revisions(self, revision_ids, pb):
+        """Fetch everything for the list of revisions.
+
+        :param revision_ids: The list of revisions to fetch. Must be in
+            topological order.
+        :param pb: A ProgressBar
+        :return: None
+        """
+        basis_id, basis_tree = self._get_basis(revision_ids[0])
+        batch_size = 100
+        for offset in range(0, len(revision_ids), batch_size):
+            self.target.start_write_group()
+            try:
+                pb.update('Transferring revisions', offset,
+                          len(revision_ids))
+                batch = revision_ids[offset:offset+batch_size]
+                basis_id, basis_tree = self._fetch_batch(batch,
+                    basis_id, basis_tree)
+            except:
+                self.target.abort_write_group()
+                raise
+            else:
+                self.target.commit_write_group()
+        pb.update('Transferring revisions', len(revision_ids),
+                  len(revision_ids))
+
     @needs_write_lock
     def fetch(self, revision_id=None, pb=None, find_ghosts=False):
         """See InterRepository.fetch()."""
         revision_ids = self.target.search_missing_revision_ids(self.source,
             revision_id, find_ghosts=find_ghosts).get_keys()
+        if not revision_ids:
+            return 0, 0
         revision_ids = tsort.topo_sort(
             self.source.get_graph().get_parent_map(revision_ids))
-        def revisions_iterator():
-            rev_ids = list(revision_ids)
-            for offset in xrange(0, len(rev_ids), 100):
-                current_revids = rev_ids[offset:offset+100]
-                revisions = self.source.get_revisions(current_revids)
-                trees = self.source.revision_trees(current_revids)
-                keys = [(r,) for r in current_revids]
-                sig_stream = self.source.signatures.get_record_stream(
-                    keys, 'unordered', True)
-                sigs = {}
-                for record in versionedfile.filter_absent(sig_stream):
-                    sigs[record.key[0]] = record.get_bytes_as('fulltext')
-                for rev, tree in zip(revisions, trees):
-                    yield rev, tree, sigs.get(rev.revision_id, None)
         if pb is None:
             my_pb = ui.ui_factory.nested_progress_bar()
             pb = my_pb
         else:
             my_pb = None
         try:
-            install_revisions(self.target, revisions_iterator(),
-                              len(revision_ids), pb)
+            self._fetch_all_revisions(revision_ids, pb)
         finally:
             if my_pb is not None:
                 my_pb.finished()
         return len(revision_ids), 0
 
+    def _get_basis(self, first_revision_id):
+        """Get a revision and tree which exists in the target.
+
+        This assumes that first_revision_id is selected for transmission
+        because all other ancestors are already present. If we can't find an
+        ancestor we fall back to NULL_REVISION since we know that is safe.
+
+        :return: (basis_id, basis_tree)
+        """
+        first_rev = self.source.get_revision(first_revision_id)
+        try:
+            basis_id = first_rev.parent_ids[0]
+            # only valid as a basis if the target has it
+            self.target.get_revision(basis_id)
+            # Try to get a basis tree - if it's a ghost it will hit the
+            # NoSuchRevision case.
+            basis_tree = self.source.revision_tree(basis_id)
+        except (IndexError, errors.NoSuchRevision):
+            basis_id = _mod_revision.NULL_REVISION
+            basis_tree = self.source.revision_tree(basis_id)
+        return basis_id, basis_tree
+
 
 class InterOtherToRemote(InterRepository):
     """An InterRepository that simply delegates to the 'real' InterRepository

=== modified file 'bzrlib/tests/interrepository_implementations/__init__.py'
--- a/bzrlib/tests/interrepository_implementations/__init__.py	2008-04-30 20:09:39 +0000
+++ b/bzrlib/tests/interrepository_implementations/__init__.py	2008-12-05 22:18:47 +0000
@@ -122,6 +122,9 @@
         result.append((InterKnitRepo,
                        pack_repo.RepositoryFormatKnitPack3(),
                        knitrepo.RepositoryFormatKnit3()))
+        result.append((InterKnitRepo,
+                       pack_repo.RepositoryFormatKnitPack3(),
+                       pack_repo.RepositoryFormatKnitPack4()))
         return result
 
 

=== modified file 'bzrlib/tests/per_repository/__init__.py'
--- a/bzrlib/tests/per_repository/__init__.py	2008-11-12 03:56:51 +0000
+++ b/bzrlib/tests/per_repository/__init__.py	2008-12-05 16:07:04 +0000
@@ -859,6 +859,7 @@
     prefix = 'bzrlib.tests.per_repository.'
     test_repository_modules = [
         'test_add_fallback_repository',
+        'test_add_inventory_by_delta',
         'test_break_lock',
         'test_check',
         # test_check_reconcile is intentionally omitted, see below.

=== added file 'bzrlib/tests/per_repository/test_add_inventory_by_delta.py'
--- a/bzrlib/tests/per_repository/test_add_inventory_by_delta.py	1970-01-01 00:00:00 +0000
+++ b/bzrlib/tests/per_repository/test_add_inventory_by_delta.py	2008-12-07 17:33:03 +0000
@@ -0,0 +1,90 @@
+# Copyright (C) 2008 Canonical Ltd
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+
+"""Tests for Repository.add_inventory_by_delta."""
+
+from bzrlib import errors, revision
+from bzrlib.tests.per_repository import TestCaseWithRepository
+
+
+class TestAddInventoryByDelta(TestCaseWithRepository):
+
+    def _get_repo_in_write_group(self, path='repository'):
+        repo = self.make_repository(path)
+        repo.lock_write()
+        self.addCleanup(repo.unlock)
+        repo.start_write_group()
+        return repo
+
+    def test_basis_missing_errors(self):
+        repo = self._get_repo_in_write_group()
+        try:
+            self.assertRaises(errors.NoSuchRevision,
+                repo.add_inventory_by_delta, "missing-revision", [],
+                "new-revision", ["missing-revision"])
+        finally:
+            repo.abort_write_group()
+
+    def test_not_in_write_group_errors(self):
+        repo = self.make_repository('repository')
+        repo.lock_write()
+        self.addCleanup(repo.unlock)
+        self.assertRaises(AssertionError, repo.add_inventory_by_delta,
+            "missing-revision", [], "new-revision", ["missing-revision"])
+
+    def make_inv_delta(self, old, new):
+        """Make an inventory delta from two inventories."""
+        old_ids = set(old._byid.iterkeys())
+        new_ids = set(new._byid.iterkeys())
+        adds = new_ids - old_ids
+        deletes = old_ids - new_ids
+        common = old_ids.intersection(new_ids)
+        delta = []
+        for file_id in deletes:
+            delta.append((old.id2path(file_id), None, file_id, None))
+        for file_id in adds:
+            delta.append((None, new.id2path(file_id), file_id, new[file_id]))
+        for file_id in common:
+            if old[file_id] != new[file_id]:
+                delta.append((old.id2path(file_id), new.id2path(file_id),
+                    file_id, new[file_id]))
+        return delta
+
+    def test_same_validator(self):
+        # Adding an inventory via delta or direct results in the same
+        # validator.
+        tree = self.make_branch_and_tree('tree')
+        revid = tree.commit("empty post")
+        revtree = tree.basis_tree()
+        revtree.lock_read()
+        self.addCleanup(revtree.unlock)
+        new_inv = revtree.inventory
+        delta = self.make_inv_delta(
+            tree.branch.repository.revision_tree(revision.NULL_REVISION).inventory,
+            new_inv)
+        repo_direct = self._get_repo_in_write_group('direct')
+        add_validator = repo_direct.add_inventory(revid, new_inv, [])
+        repo_direct.commit_write_group()
+        repo_delta = self._get_repo_in_write_group('delta')
+        try:
+            delta_validator, inv = repo_delta.add_inventory_by_delta(
+                revision.NULL_REVISION, delta, revid, [])
+        except:
+            repo_delta.abort_write_group()
+            raise
+        else:
+            repo_delta.commit_write_group()
+        self.assertEqual(add_validator, delta_validator)

=== modified file 'bzrlib/tests/per_repository/test_commit_builder.py'
--- a/bzrlib/tests/per_repository/test_commit_builder.py	2008-11-12 07:38:57 +0000
+++ b/bzrlib/tests/per_repository/test_commit_builder.py	2008-12-05 17:25:01 +0000
@@ -147,7 +147,7 @@
         parent_tree = tree.basis_tree()
         parent_tree.lock_read()
         self.addCleanup(parent_tree.unlock)
-        builder = tree.branch.get_commit_builder([parent_tree.inventory])
+        builder = tree.branch.get_commit_builder([old_revision_id])
         try:
             ie = inventory.make_entry('directory', '', None,
                     tree.get_root_id())
@@ -159,9 +159,9 @@
             # should be in the delta
             got_new_revision = ie.revision != old_revision_id
             if got_new_revision:
-                self.assertEqual(
-                    ('', '', ie.file_id, ie),
-                    delta)
+                self.assertEqual(('', '', ie.file_id, ie), delta)
+                # The delta should be tracked
+                self.assertEqual(delta, builder._basis_delta[-1])
             else:
                 self.assertEqual(None, delta)
             # Directories do not get hashed.
@@ -191,6 +191,115 @@
         # but thats all the current contract guarantees anyway.
         self.assertEqual(rev_id, tree.branch.repository.get_inventory(rev_id).revision_id)
 
+    def test_get_basis_delta(self):
+        tree = self.make_branch_and_tree(".")
+        self.build_tree(["foo"])
+        tree.add(["foo"], ["foo-id"])
+        old_revision_id = tree.commit("added foo")
+        tree.lock_write()
+        try:
+            self.build_tree(['bar'])
+            tree.add(['bar'], ['bar-id'])
+            basis = tree.branch.repository.revision_tree(old_revision_id)
+            basis.lock_read()
+            self.addCleanup(basis.unlock)
+            builder = tree.branch.get_commit_builder([old_revision_id])
+            total_delta = []
+            try:
+                parent_invs = [basis.inventory]
+                builder.will_record_deletes()
+                if builder.record_root_entry:
+                    ie = basis.inventory.root.copy()
+                    delta, _, _ = builder.record_entry_contents(ie, parent_invs,
+                        '', tree, tree.path_content_summary(''))
+                    if delta is not None:
+                        total_delta.append(delta)
+                delta = builder.record_delete("foo", "foo-id")
+                total_delta.append(delta)
+                new_bar = inventory.make_entry('file', 'bar',
+                    parent_id=tree.get_root_id(), file_id='bar-id')
+                delta, _, _ = builder.record_entry_contents(new_bar, parent_invs,
+                    'bar', tree, tree.path_content_summary('bar'))
+                total_delta.append(delta)
+                # All actions should have been recorded in the basis_delta
+                self.assertEqual(total_delta, builder.get_basis_delta())
+                builder.finish_inventory()
+                builder.commit('delete foo, add bar')
+            except:
+                tree.branch.repository.abort_write_group()
+                raise
+        finally:
+            tree.unlock()
+
+    def test_get_basis_delta_without_notification(self):
+        tree = self.make_branch_and_tree(".")
+        old_revision_id = tree.commit('')
+        tree.lock_write()
+        try:
+            parent_tree = tree.basis_tree()
+            parent_tree.lock_read()
+            self.addCleanup(parent_tree.unlock)
+            builder = tree.branch.get_commit_builder([old_revision_id])
+            # It is an error to expect builder.get_basis_delta() to be correct,
+            # if you have not also called will_record_deletes() to indicate you
+            # will be calling record_delete() when appropriate
+            self.assertRaises(AssertionError, builder.get_basis_delta)
+            tree.branch.repository.abort_write_group()
+        finally:
+            tree.unlock()
+
+    def test_record_delete(self):
+        tree = self.make_branch_and_tree(".")
+        self.build_tree(["foo"])
+        tree.add(["foo"], ["foo-id"])
+        rev_id = tree.commit("added foo")
+        # Remove the inventory details for foo-id, because
+        # record_entry_contents ends up copying root verbatim.
+        tree.unversion(["foo-id"])
+        tree.lock_write()
+        try:
+            basis = tree.branch.repository.revision_tree(rev_id)
+            builder = tree.branch.get_commit_builder([rev_id])
+            try:
+                builder.will_record_deletes()
+                if builder.record_root_entry is True:
+                    parent_invs = [basis.inventory]
+                    del basis.inventory.root.children['foo']
+                    builder.record_entry_contents(basis.inventory.root,
+                        parent_invs, '', tree, tree.path_content_summary(''))
+                # the delta should be returned, and recorded in _basis_delta
+                delta = builder.record_delete("foo", "foo-id")
+                self.assertEqual(("foo", None, "foo-id", None), delta)
+                self.assertEqual(delta, builder._basis_delta[-1])
+                builder.finish_inventory()
+                rev_id2 = builder.commit('delete foo')
+            except:
+                tree.branch.repository.abort_write_group()
+                raise
+        finally:
+            tree.unlock()
+        rev_tree = builder.revision_tree()
+        rev_tree.lock_read()
+        self.addCleanup(rev_tree.unlock)
+        self.assertFalse(rev_tree.path2id('foo'))
+
+    def test_record_delete_without_notification(self):
+        tree = self.make_branch_and_tree(".")
+        self.build_tree(["foo"])
+        tree.add(["foo"], ["foo-id"])
+        rev_id = tree.commit("added foo")
+        tree.lock_write()
+        try:
+            builder = tree.branch.get_commit_builder([rev_id])
+            try:
+                self.record_root(builder, tree)
+                self.assertRaises(AssertionError,
+                    builder.record_delete, "foo", "foo-id")
+            finally:
+                tree.branch.repository.abort_write_group()
+        finally:
+            tree.unlock()
+
     def test_revision_tree(self):
         tree = self.make_branch_and_tree(".")
         tree.lock_write()
@@ -365,7 +474,7 @@
     def mini_commit(self, tree, name, new_name, records_version=True,
         delta_against_basis=True, expect_fs_hash=False):
         """Perform a miniature commit looking for record entry results.
-        
+
         :param tree: The tree to commit.
         :param name: The path in the basis tree of the tree being committed.
         :param new_name: The path in the tree being committed.
@@ -424,6 +533,8 @@
             new_entry = builder.new_inventory[file_id]
             if delta_against_basis:
                 expected_delta = (name, new_name, file_id, new_entry)
+                # The delta should be recorded
+                self.assertEqual(expected_delta, builder._basis_delta[-1])
             else:
                 expected_delta = None
             self.assertEqual(expected_delta, delta)

=== modified file 'bzrlib/tests/per_repository/test_repository.py'
--- a/bzrlib/tests/per_repository/test_repository.py	2008-12-01 19:07:21 +0000
+++ b/bzrlib/tests/per_repository/test_repository.py	2008-12-10 03:21:33 +0000
@@ -1008,6 +1008,8 @@
         try:
             self.assertRaises(errors.ReservedId, repo.add_inventory, 'reserved:',
                               None, None)
+            self.assertRaises(errors.ReservedId, repo.add_inventory_by_delta,
+                "foo", [], 'reserved:', None)
             self.assertRaises(errors.ReservedId, repo.add_revision, 'reserved:',
                               None)
         finally:



