Rev 3172: (robertc) Add Repository.iter_inventories reducing code duplication. in file:///home/pqm/archives/thelove/bzr/%2Btrunk/

Canonical.com Patch Queue Manager pqm at pqm.ubuntu.com
Thu Jan 10 02:56:34 GMT 2008


At file:///home/pqm/archives/thelove/bzr/%2Btrunk/

------------------------------------------------------------
revno: 3172
revision-id:pqm at pqm.ubuntu.com-20080110025628-6tl4b9cmdn335suw
parent: pqm at pqm.ubuntu.com-20080110011547-97smthgbb8hfshs7
parent: robertc at robertcollins.net-20080110014517-k02chl1xfz8ftdqa
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Thu 2008-01-10 02:56:28 +0000
message:
  (robertc) Add Repository.iter_inventories reducing code duplication.
  	(Robert Collins)
modified:
  NEWS                           NEWS-20050323055033-4e00b5db738777ff
  bzrlib/bzrdir.py               bzrdir.py-20060131065624-156dfea39c4387cb
  bzrlib/fetch.py                fetch.py-20050818234941-26fea6105696365d
  bzrlib/remote.py               remote.py-20060720103555-yeeg2x51vn0rbtdp-1
  bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
  bzrlib/tests/interrepository_implementations/test_interrepository.py test_interrepository.py-20060220061411-1ec13fa99e5e3eee
  bzrlib/tests/repository_implementations/test_repository.py test_repository.py-20060131092128-ad07f494f5c9d26c
  bzrlib/tests/test_repository.py test_repository.py-20060131075918-65c555b881612f4d
  bzrlib/xml_serializer.py       xml.py-20050309040759-57d51586fdec365d
    ------------------------------------------------------------
    revno: 3169.2.4
    revision-id:robertc at robertcollins.net-20080110014517-k02chl1xfz8ftdqa
    parent: robertc at robertcollins.net-20080110013010-s0eei50kbu2jg6tt
    parent: pqm at pqm.ubuntu.com-20080110011547-97smthgbb8hfshs7
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: integration
    timestamp: Thu 2008-01-10 12:45:17 +1100
    message:
      Merge bzr.dev to deal with reindenting in NEWS.
    modified:
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
      bzrlib/debug.py                debug.py-20061102062349-vdhrw9qdpck8cl35-1
      bzrlib/help_topics/__init__.py help_topics.py-20060920210027-rnim90q9e0bwxvy4-1
      bzrlib/tests/TestUtil.py       TestUtil.py-20050824080200-5f70140a2d938694
      bzrlib/tests/test_osutils.py   test_osutils.py-20051201224856-e48ee24c12182989
      bzrlib/trace.py                trace.py-20050309040759-c8ed824bdcd4748a
    ------------------------------------------------------------
    revno: 3169.2.3
    revision-id:robertc at robertcollins.net-20080110013010-s0eei50kbu2jg6tt
    parent: robertc at robertcollins.net-20080109005101-mmpiihes7sw3uzr5
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: iter_inventories
    timestamp: Thu 2008-01-10 12:30:10 +1100
    message:
      Use an if, not an assert, as we test with -O.
    modified:
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
    ------------------------------------------------------------
    revno: 3169.2.2
    revision-id:robertc at robertcollins.net-20080109005101-mmpiihes7sw3uzr5
    parent: robertc at robertcollins.net-20080107012738-p74oqa65zc0z2xrr
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: iter_inventories
    timestamp: Wed 2008-01-09 11:51:01 +1100
    message:
      Add a check to Repository.deserialise_inventory that the resulting inventory is the one asked for, and update relevant tests. Also tweak the model 1 to 2 regenerate inventories logic to use the revision tree's parent marker, which is more accurate in some cases.
    modified:
      bzrlib/bzrdir.py               bzrdir.py-20060131065624-156dfea39c4387cb
      bzrlib/fetch.py                fetch.py-20050818234941-26fea6105696365d
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
      bzrlib/tests/interrepository_implementations/test_interrepository.py test_interrepository.py-20060220061411-1ec13fa99e5e3eee
      bzrlib/tests/test_repository.py test_repository.py-20060131075918-65c555b881612f4d
      bzrlib/xml_serializer.py       xml.py-20050309040759-57d51586fdec365d
    ------------------------------------------------------------
    revno: 3169.2.1
    revision-id:robertc at robertcollins.net-20080107012738-p74oqa65zc0z2xrr
    parent: pqm at pqm.ubuntu.com-20080105015401-67wgbytv81394cl1
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: iter_inventories
    timestamp: Mon 2008-01-07 12:27:38 +1100
    message:
      New method ``iter_inventories`` on Repository for access to many
      inventories. This is primarily used by the ``revision_trees`` method, as
      direct access to inventories is discouraged. (Robert Collins)
    modified:
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
      bzrlib/remote.py               remote.py-20060720103555-yeeg2x51vn0rbtdp-1
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
      bzrlib/tests/repository_implementations/test_repository.py test_repository.py-20060131092128-ad07f494f5c9d26c
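
The shape of the change described in the log above can be sketched outside bzrlib. This is a minimal illustration under stated assumptions, not the bzrlib implementation: `ToyRepository`, `ToyInventory`, and the dict-backed text store are hypothetical stand-ins, and the sketch raises `ValueError` where the diff below uses `assert`. It shows a public `iter_inventories` that validates its arguments and delegates to a private generator, with the single-inventory and tree accessors built on top of it.

```python
NULL_REVISION = "null:"  # the reserved id named in the revision_trees docstring


class ToyInventory:
    """Hypothetical stand-in for bzrlib's Inventory."""

    def __init__(self, revision_id, text):
        self.revision_id = revision_id
        self.text = text


class ToyRepository:
    """Hypothetical repository holding one serialised text per revision."""

    def __init__(self, texts):
        self._texts = dict(texts)

    def _deserialise(self, revision_id, text):
        # Stand-in for deserialise_inventory: "parse" a single text.
        return ToyInventory(revision_id, text)

    def iter_inventories(self, revision_ids):
        # Plain function, so this validation runs at call time rather than
        # on the first next() of the returned iterator.
        if None in revision_ids or NULL_REVISION in revision_ids:
            raise ValueError("reserved revision id requested")
        return self._iter_inventories(revision_ids)

    def _iter_inventories(self, revision_ids):
        # Texts are gathered up front; inventories are parsed one at a time.
        texts = [self._texts[rev_id] for rev_id in revision_ids]
        for text, rev_id in zip(texts, revision_ids):
            yield self._deserialise(rev_id, text)

    def get_inventory(self, revision_id):
        # The single-item accessor reuses the iterator.
        return next(self.iter_inventories([revision_id]))

    def revision_trees(self, revision_ids):
        # Trees are derived from inventories; a tuple stands in for a tree.
        for inv in self.iter_inventories(revision_ids):
            yield (inv.revision_id, inv)


repo = ToyRepository({"rev-1": "<inventory 1>", "rev-2": "<inventory 2>"})
ordered = [inv.revision_id for inv in repo.iter_inventories(["rev-1", "rev-2"])]
```

One effect of splitting the public method from the generator, at least in this sketch, is that bad arguments are rejected as soon as `iter_inventories` is called, while parsing still happens lazily as the caller iterates.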
=== modified file 'NEWS'
--- a/NEWS	2008-01-09 07:04:23 +0000
+++ b/NEWS	2008-01-10 01:45:17 +0000
@@ -143,7 +143,7 @@
 
    * Add -Dtimes debug flag, which records a timestamp against each mutter to
      the trace file, relative to the first mutter.  (Andrew Bennetts)
-    
+   
    * find_* methods available for BzrDirs, Branches and WorkingTrees.
      (Aaron Bentley)
 
@@ -152,6 +152,10 @@
 
    * get_parent_map now always provides tuples as its output.  (Aaron Bentley)
 
+   * New method ``iter_inventories`` on Repository for access to many
+     inventories. This is primarily used by the ``revision_trees`` method, as
+     direct access to inventories is discouraged. (Robert Collins)
+
    * Parent Providers should now implement ``get_parent_map`` returning a
      dictionary instead of ``get_parents`` returning a list.
      ``get_parents`` is now considered deprecated.  (John Arbash Meinel)

=== modified file 'bzrlib/bzrdir.py'
--- a/bzrlib/bzrdir.py	2007-12-22 00:03:14 +0000
+++ b/bzrlib/bzrdir.py	2008-01-09 00:51:01 +0000
@@ -2040,7 +2040,7 @@
     def _load_updated_inventory(self, rev_id):
         assert rev_id in self.converted_revs
         inv_xml = self.inv_weave.get_text(rev_id)
-        inv = xml5.serializer_v5.read_inventory_from_string(inv_xml)
+        inv = xml5.serializer_v5.read_inventory_from_string(inv_xml, rev_id)
         return inv
 
     def _convert_one_rev(self, rev_id):

=== modified file 'bzrlib/fetch.py'
--- a/bzrlib/fetch.py	2007-11-21 03:35:07 +0000
+++ b/bzrlib/fetch.py	2008-01-09 00:51:01 +0000
@@ -362,9 +362,8 @@
         stored in the target (reserializing it in a different format).
         :param revs: The revisions to include
         """
-        inventory_weave = self.source.get_inventory_weave()
         for tree in self.iter_rev_trees(revs):
-            parents = inventory_weave.get_parents(tree.get_revision_id())
+            parents = tree.get_parent_ids()
             self.target.add_inventory(tree.get_revision_id(), tree.inventory,
                                       parents)
 
@@ -372,7 +371,7 @@
 class Model1toKnit2Fetcher(GenericRepoFetcher):
     """Fetch from a Model1 repository into a Knit2 repository
     """
-    def __init__(self, to_repository, from_repository, last_revision=None, 
+    def __init__(self, to_repository, from_repository, last_revision=None,
                  pb=None):
         self.helper = Inter1and2Helper(from_repository, to_repository)
         GenericRepoFetcher.__init__(self, to_repository, from_repository,

=== modified file 'bzrlib/remote.py'
--- a/bzrlib/remote.py	2007-12-21 19:56:30 +0000
+++ b/bzrlib/remote.py	2008-01-07 01:27:38 +0000
@@ -644,6 +644,10 @@
         self._ensure_real()
         return self._real_repository.get_inventory(revision_id)
 
+    def iter_inventories(self, revision_ids):
+        self._ensure_real()
+        return self._real_repository.iter_inventories(revision_ids)
+
     @needs_read_lock
     def get_revision(self, revision_id):
         self._ensure_real()

=== modified file 'bzrlib/repository.py'
--- a/bzrlib/repository.py	2008-01-04 04:57:47 +0000
+++ b/bzrlib/repository.py	2008-01-10 01:30:10 +0000
@@ -502,7 +502,8 @@
         :param parents: The revision ids of the parents that revision_id
                         is known to have and are in the repository already.
 
-        returns the sha1 of the serialized inventory.
+        :returns: The validator (which is a sha1 digest, though what is sha'd is
+            repository format specific) of the serialized inventory.
         """
         assert self.is_in_write_group()
         _mod_revision.check_not_reserved_id(revision_id)
@@ -1450,9 +1451,27 @@
 
     @needs_read_lock
     def get_inventory(self, revision_id):
-        """Get Inventory object by hash."""
-        return self.deserialise_inventory(
-            revision_id, self.get_inventory_xml(revision_id))
+        """Get Inventory object by revision id."""
+        return self.iter_inventories([revision_id]).next()
+
+    def iter_inventories(self, revision_ids):
+        """Get many inventories by revision_ids.
+
+        This will buffer some or all of the texts used in constructing the
+        inventories in memory, but will only parse a single inventory at a
+        time.
+
+        :return: An iterator of inventories.
+        """
+        assert None not in revision_ids
+        assert _mod_revision.NULL_REVISION not in revision_ids
+        return self._iter_inventories(revision_ids)
+
+    def _iter_inventories(self, revision_ids):
+        """single-document based inventory iteration."""
+        texts = self.get_inventory_weave().get_texts(revision_ids)
+        for text, revision_id in zip(texts, revision_ids):
+            yield self.deserialise_inventory(revision_id, text)
 
     def deserialise_inventory(self, revision_id, xml):
         """Transform the xml into an inventory object. 
@@ -1460,7 +1479,11 @@
         :param revision_id: The expected revision id of the inventory.
         :param xml: A serialised inventory.
         """
-        return self._serializer.read_inventory_from_string(xml, revision_id)
+        result = self._serializer.read_inventory_from_string(xml, revision_id)
+        if result.revision_id != revision_id:
+            raise AssertionError('revision id mismatch %s != %s' % (
+                result.revision_id, revision_id))
+        return result
 
     def serialise_inventory(self, inv):
         return self._serializer.write_inventory_to_string(inv)
@@ -1626,12 +1649,9 @@
         """Return Tree for a revision on this branch.
 
         `revision_id` may not be None or 'null:'"""
-        assert None not in revision_ids
-        assert _mod_revision.NULL_REVISION not in revision_ids
-        texts = self.get_inventory_weave().get_texts(revision_ids)
-        for text, revision_id in zip(texts, revision_ids):
-            inv = self.deserialise_inventory(revision_id, text)
-            yield RevisionTree(self, inv, revision_id)
+        inventories = self.iter_inventories(revision_ids)
+        for inv in inventories:
+            yield RevisionTree(self, inv, inv.revision_id)
 
     @needs_read_lock
     def get_ancestry(self, revision_id, topo_sorted=True):

=== modified file 'bzrlib/tests/interrepository_implementations/test_interrepository.py'
--- a/bzrlib/tests/interrepository_implementations/test_interrepository.py	2007-11-30 22:13:29 +0000
+++ b/bzrlib/tests/interrepository_implementations/test_interrepository.py	2008-01-09 00:51:01 +0000
@@ -262,24 +262,6 @@
         to_repo = self.make_to_repository('to')
         to_repo.fetch(from_tree.branch.repository, from_tree.get_parent_ids()[0])
 
-    def test_fetch_no_inventory_revision(self):
-        """Old inventories lack revision_ids, so simulate this"""
-        from_tree = self.make_branch_and_tree('tree')
-        if sys.platform == 'win32':
-            from_repo = from_tree.branch.repository
-            check_repo_format_for_funky_id_on_win32(from_repo)
-        self.build_tree(['tree/filename'])
-        from_tree.add('filename', 'funky-chars<>%&;"\'')
-        from_tree.commit('commit filename')
-        old_deserialise = from_tree.branch.repository.deserialise_inventory
-        def deserialise(revision_id, text):
-            inventory = old_deserialise(revision_id, text)
-            inventory.revision_id = None
-            return inventory
-        from_tree.branch.repository.deserialise_inventory = deserialise
-        to_repo = self.make_to_repository('to')
-        to_repo.fetch(from_tree.branch.repository, from_tree.last_revision())
-
 
 class TestCaseWithComplexRepository(TestCaseWithInterRepository):
 

=== modified file 'bzrlib/tests/repository_implementations/test_repository.py'
--- a/bzrlib/tests/repository_implementations/test_repository.py	2008-01-02 15:49:06 +0000
+++ b/bzrlib/tests/repository_implementations/test_repository.py	2008-01-07 01:27:38 +0000
@@ -77,6 +77,19 @@
         tree_b.get_file_text('file1')
         rev1 = repo_b.get_revision('rev1')
 
+    def test_iter_inventories_is_ordered(self):
+        # just a smoke test
+        tree = self.make_branch_and_tree('a')
+        first_revision = tree.commit('')
+        second_revision = tree.commit('')
+        tree.lock_read()
+        self.addCleanup(tree.unlock)
+        revs = (first_revision, second_revision)
+        invs = tree.branch.repository.iter_inventories(revs)
+        for rev_id, inv in zip(revs, invs):
+            self.assertEqual(rev_id, inv.revision_id)
+            self.assertIsInstance(inv, Inventory)
+
     def test_supports_rich_root(self):
         tree = self.make_branch_and_tree('a')
         tree.commit('')

=== modified file 'bzrlib/tests/test_repository.py'
--- a/bzrlib/tests/test_repository.py	2007-12-30 18:20:15 +0000
+++ b/bzrlib/tests/test_repository.py	2008-01-09 00:51:01 +0000
@@ -398,7 +398,9 @@
         # Arguably, the deserialise_inventory should detect a mismatch, and
         # raise an error, rather than silently using one revision_id over the
         # other.
-        inv = repo.deserialise_inventory('test-rev-id', inv_xml)
+        self.assertRaises(AssertionError, repo.deserialise_inventory,
+            'test-rev-id', inv_xml)
+        inv = repo.deserialise_inventory('other-rev-id', inv_xml)
         self.assertEqual('other-rev-id', inv.root.revision)
 
 

=== modified file 'bzrlib/xml_serializer.py'
--- a/bzrlib/xml_serializer.py	2007-10-05 02:41:37 +0000
+++ b/bzrlib/xml_serializer.py	2008-01-09 00:51:01 +0000
@@ -65,7 +65,11 @@
         :param xml_string: The xml to read.
         :param revision_id: If not-None, the expected revision id of the
             inventory. Some serialisers use this to set the results' root
-            revision.
+            revision. This should be supplied for deserialising all
+            from-repository inventories so that xml5 inventories that were
+            serialised without a revision identifier can be given the right
+            revision id (but not for working tree inventories where users can
+            edit the data without triggering checksum errors or anything).
         """
         try:
             return self._unpack_inventory(fromstring(xml_string), revision_id)
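
The "Use an if, not an assert, as we test with -O" change from revno 3169.2.3 can be illustrated in isolation. Python drops `assert` statements when run with `-O`, so a check written as an `assert` would vanish under the optimised test run, whereas an explicit `if` with `raise` always executes. The sketch below is a hedged illustration only: `ParsedInventory`, `fake_parse`, and `deserialise_checked` are hypothetical names, not bzrlib APIs.

```python
class ParsedInventory:
    """Hypothetical parse result carrying the revision id found in the XML."""

    def __init__(self, revision_id):
        self.revision_id = revision_id


def deserialise_checked(parse, revision_id, xml):
    """Parse xml and insist that the result is the inventory asked for."""
    result = parse(xml, revision_id)
    # An explicit raise still runs under `python -O`; an `assert` statement
    # would be removed there and the mismatch would go unnoticed.
    if result.revision_id != revision_id:
        raise AssertionError(
            "revision id mismatch %s != %s" % (result.revision_id, revision_id)
        )
    return result


def fake_parse(xml, revision_id):
    # Hypothetical parser: pretends the XML always records 'other-rev-id'.
    return ParsedInventory("other-rev-id")


matched = deserialise_checked(fake_parse, "other-rev-id", "<inventory/>")
try:
    deserialise_checked(fake_parse, "test-rev-id", "<inventory/>")
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True
```

This mirrors the updated test in bzrlib/tests/test_repository.py, which expects an `AssertionError` for 'test-rev-id' and a successful result for 'other-rev-id'.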



