Rev 3758: Merge improvements to fetch from Andrew. in http://people.ubuntu.com/~robertc/baz2.0/repository

Robert Collins robertc at robertcollins.net
Tue Nov 11 10:02:59 GMT 2008


At http://people.ubuntu.com/~robertc/baz2.0/repository

------------------------------------------------------------
revno: 3758
revision-id: robertc at robertcollins.net-20081111100253-hx2ndctrnwilr62i
parent: robertc at robertcollins.net-20081111060249-6xs3p8czmy5ipkkp
parent: andrew.bennetts at canonical.com-20081111043629-ojx8u4wob9kwuatt
committer: Robert Collins <robertc at robertcollins.net>
branch nick: repository
timestamp: Tue 2008-11-11 21:02:53 +1100
message:
  Merge improvements to fetch from Andrew.
modified:
  bzrlib/fetch.py                fetch.py-20050818234941-26fea6105696365d
  bzrlib/repofmt/pack_repo.py    pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
  bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
  bzrlib/tests/test_repository.py test_repository.py-20060131075918-65c555b881612f4d
    ------------------------------------------------------------
    revno: 3756.1.5
    revision-id: andrew.bennetts at canonical.com-20081111043629-ojx8u4wob9kwuatt
    parent: andrew.bennetts at canonical.com-20081111042843-7z2eipckbjp4uf8j
    committer: Andrew Bennetts <andrew.bennetts at canonical.com>
    branch nick: chk
    timestamp: Tue 2008-11-11 14:36:29 +1000
    message:
      Remove print statement.
    modified:
      bzrlib/knit.py                 knit.py-20051212171256-f056ac8f0fbe1bd9
    ------------------------------------------------------------
    revno: 3756.1.4
    revision-id: andrew.bennetts at canonical.com-20081111042843-7z2eipckbjp4uf8j
    parent: andrew.bennetts at canonical.com-20081111021412-qtdfci313s63r252
    committer: Andrew Bennetts <andrew.bennetts at canonical.com>
    branch nick: chk
    timestamp: Tue 2008-11-11 14:28:43 +1000
    message:
      Change the layering, to put the custom file_id list underneath item_keys_introduced_by
    modified:
      bzrlib/fetch.py                fetch.py-20050818234941-26fea6105696365d
      bzrlib/knit.py                 knit.py-20051212171256-f056ac8f0fbe1bd9
      bzrlib/repofmt/pack_repo.py    pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
    ------------------------------------------------------------
    revno: 3756.1.3
    revision-id: andrew.bennetts at canonical.com-20081111021412-qtdfci313s63r252
    parent: andrew.bennetts at canonical.com-20081111000537-afdxn3ik67zoah9d
    committer: Andrew Bennetts <andrew.bennetts at canonical.com>
    branch nick: chk
    timestamp: Tue 2008-11-11 12:14:12 +1000
    message:
      Add _find_keys_to_fetch, and use it instead of item_keys_introduced_by when available.
    modified:
      bzrlib/fetch.py                fetch.py-20050818234941-26fea6105696365d
      bzrlib/repofmt/pack_repo.py    pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
    ------------------------------------------------------------
    revno: 3756.1.2
    revision-id: andrew.bennetts at canonical.com-20081111000537-afdxn3ik67zoah9d
    parent: andrew.bennetts at canonical.com-20081111000317-ufw8r6drmobmqtyn
    committer: Andrew Bennetts <andrew.bennetts at canonical.com>
    branch nick: chk
    timestamp: Tue 2008-11-11 10:05:37 +1000
    message:
      Add fix, comment.
    modified:
      bzrlib/repofmt/pack_repo.py    pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
    ------------------------------------------------------------
    revno: 3756.1.1
    revision-id: andrew.bennetts at canonical.com-20081111000317-ufw8r6drmobmqtyn
    parent: robertc at robertcollins.net-20081106231431-km10poyn95ifnjkh
    committer: Andrew Bennetts <andrew.bennetts at canonical.com>
    branch nick: chk
    timestamp: Tue 2008-11-11 10:03:17 +1000
    message:
      Add _find_revision_outside_set.
    modified:
      bzrlib/repofmt/pack_repo.py    pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
      bzrlib/tests/test_repository.py test_repository.py-20060131075918-65c555b881612f4d
=== modified file 'bzrlib/fetch.py'
--- a/bzrlib/fetch.py	2008-10-14 04:33:32 +0000
+++ b/bzrlib/fetch.py	2008-11-11 04:28:43 +0000
@@ -157,11 +157,10 @@
         phase = 'file'
         pb = bzrlib.ui.ui_factory.nested_progress_bar()
         try:
+            from_repo = self.from_repository
             revs = search.get_keys()
-            graph = self.from_repository.get_graph()
-            revs = list(graph.iter_topo_order(revs))
-            data_to_fetch = self.from_repository.item_keys_introduced_by(revs,
-                                                                         pb)
+            revs = list(from_repo.get_graph().iter_topo_order(revs))
+            data_to_fetch = from_repo.item_keys_introduced_by(revs, pb)
             text_keys = []
             for knit_kind, file_id, revisions in data_to_fetch:
                 if knit_kind != phase:
@@ -189,16 +188,17 @@
                     # Before we process the inventory we generate the root
                     # texts (if necessary) so that the inventories references
                     # will be valid.
-                    self._generate_root_texts(revs)
+                    self._generate_root_texts(revisions)
                     # NB: This currently reopens the inventory weave in source;
                     # using a single stream interface instead would avoid this.
-                    self._fetch_inventory_weave(revs, pb)
+                    self._fetch_inventory_weave(revisions, pb)
                 elif knit_kind == "signatures":
                     # Nothing to do here; this will be taken care of when
                     # _fetch_revision_texts happens.
                     pass
                 elif knit_kind == "revisions":
-                    self._fetch_revision_texts(revs, pb)
+                    self._fetch_revision_texts(revisions, pb)
+                    self.count_copied += len(revisions)
                 else:
                     raise AssertionError("Unknown knit kind %r" % knit_kind)
             if self.to_repository._fetch_reconcile:
@@ -206,7 +206,6 @@
         finally:
             if pb is not None:
                 pb.finished()
-        self.count_copied += len(revs)
         
     def _revids_to_fetch(self):
         """Determines the exact revisions needed from self.from_repository to

=== modified file 'bzrlib/repofmt/pack_repo.py'
--- a/bzrlib/repofmt/pack_repo.py	2008-11-06 23:00:22 +0000
+++ b/bzrlib/repofmt/pack_repo.py	2008-11-11 04:28:43 +0000
@@ -53,6 +53,7 @@
     errors,
     lockable_files,
     lockdir,
+    revision as _mod_revision,
     symbol_versioning,
     )
 
@@ -2163,6 +2164,49 @@
         # make it raise to trap naughty direct users.
         raise NotImplementedError(self._iter_inventory_xmls)
 
+    def _find_revision_outside_set(self, revision_ids):
+        revision_set = frozenset(revision_ids)
+        for revid in revision_ids:
+            parent_ids = self.get_parent_map([revid]).get(revid, ())
+            for parent in parent_ids:
+                if parent in revision_set:
+                    # Parent is not outside the set
+                    continue
+                if parent not in self.get_parent_map([parent]):
+                    # Parent is a ghost
+                    continue
+                return parent
+        return _mod_revision.NULL_REVISION
+
+    def _find_file_keys_to_fetch(self, revision_ids, pb):
+        rich_root = self.supports_rich_root()
+        revision_outside_set = self._find_revision_outside_set(revision_ids)
+        if revision_outside_set == _mod_revision.NULL_REVISION:
+            uninteresting_chk_refs = set()
+        else:
+            uninteresting_inv = self.get_inventory(revision_outside_set)
+            uninteresting_map = uninteresting_inv.id_to_entry
+            uninteresting_map._ensure_root()
+            uninteresting_chk_refs = set(uninteresting_map._root_node.refs())
+        for idx, inv in enumerate(self.iter_inventories(revision_ids)):
+            pb.update('fetch', idx, len(revision_ids))
+            inv_chk_map = inv.id_to_entry
+            inv_chk_map._ensure_root()
+            candidate_names = {}
+            for name, ref in inv_chk_map._root_node._nodes.iteritems():
+                if ref in uninteresting_chk_refs:
+                    continue
+                candidate_names[name] = ref
+            for name, bytes in inv_chk_map.iteritems(candidate_names):
+                entry = inv._bytes_to_entry(bytes)
+                if entry.name == '' and not rich_root:
+                    continue
+                if entry.revision == inv.revision_id:
+                    # Add it to uninteresting_chk_refs to avoid processing
+                    # it twice if we see it again later.
+                    uninteresting_chk_refs.add(candidate_names[name])
+                    yield ("file", entry.file_id, [entry.revision])
+        
     def fileids_altered_by_revision_ids(self, revision_ids, _inv_weave=None):
         """Find the file ids and versions affected by revisions.
 

=== modified file 'bzrlib/repository.py'
--- a/bzrlib/repository.py	2008-11-06 23:00:22 +0000
+++ b/bzrlib/repository.py	2008-11-11 04:28:43 +0000
@@ -1600,6 +1600,13 @@
             versions).  knit-kind is one of 'file', 'inventory', 'signatures',
             'revisions'.  file-id is None unless knit-kind is 'file'.
         """
+        for result in self._find_file_keys_to_fetch(revision_ids, _files_pb):
+            yield result
+        del _files_pb
+        for result in self._find_non_file_keys_to_fetch(revision_ids):
+            yield result
+
+    def _find_file_keys_to_fetch(self, revision_ids, pb):
         # XXX: it's a bit weird to control the inventory weave caching in this
         # generator.  Ideally the caching would be done in fetch.py I think.  Or
         # maybe this generator should explicitly have the contract that it
@@ -1612,14 +1619,12 @@
         count = 0
         num_file_ids = len(file_ids)
         for file_id, altered_versions in file_ids.iteritems():
-            if _files_pb is not None:
-                _files_pb.update("fetch texts", count, num_file_ids)
+            if pb is not None:
+                pb.update("fetch texts", count, num_file_ids)
             count += 1
             yield ("file", file_id, altered_versions)
-        # We're done with the files_pb.  Note that it finished by the caller,
-        # just as it was created by the caller.
-        del _files_pb
 
+    def _find_non_file_keys_to_fetch(self, revision_ids):
         # inventory
         yield ("inventory", None, revision_ids)
 

=== modified file 'bzrlib/tests/test_repository.py'
--- a/bzrlib/tests/test_repository.py	2008-11-06 23:00:22 +0000
+++ b/bzrlib/tests/test_repository.py	2008-11-11 00:03:17 +0000
@@ -32,6 +32,7 @@
                            UnsupportedFormatError,
                            )
 from bzrlib import graph
+from bzrlib.branchbuilder import BranchBuilder
 from bzrlib.btree_index import BTreeBuilder, BTreeGraphIndex
 from bzrlib.index import GraphIndex, InMemoryGraphIndex
 from bzrlib.repository import RepositoryFormat
@@ -687,6 +688,64 @@
             repo.chk_bytes.keys())
 
 
+class TestDevelopment3FindRevisionOutsideSet(TestCaseWithTransport):
+    """Tests for _find_revision_outside_set."""
+
+    def setUp(self):
+        super(TestDevelopment3FindRevisionOutsideSet, self).setUp()
+        self.builder = self.make_branch_builder('source', format='development3')
+        self.builder.start_series()
+        self.builder.build_snapshot('initial', None,
+            [('add', ('', 'tree-root', 'directory', None))])
+        self.repo = self.builder.get_branch().repository
+        self.addCleanup(self.builder.finish_series)
+        
+    def assertRevisionOutsideSet(self, expected_result, rev_set):
+        self.assertEqual(
+            expected_result, self.repo._find_revision_outside_set(rev_set))
+
+    def test_simple(self):
+        self.builder.build_snapshot('revid1', None, [])
+        self.builder.build_snapshot('revid2', None, [])
+        rev_set = ['revid2']
+        self.assertRevisionOutsideSet('revid1', rev_set)
+
+    def test_not_first_parent(self):
+        self.builder.build_snapshot('revid1', None, [])
+        self.builder.build_snapshot('revid2', None, [])
+        self.builder.build_snapshot('revid3', None, [])
+        rev_set = ['revid3', 'revid2']
+        self.assertRevisionOutsideSet('revid1', rev_set)
+
+    def test_not_null(self):
+        rev_set = ['initial']
+        self.assertRevisionOutsideSet(_mod_revision.NULL_REVISION, rev_set)
+
+    def test_not_null_set(self):
+        self.builder.build_snapshot('revid1', None, [])
+        rev_set = [_mod_revision.NULL_REVISION]
+        self.assertRevisionOutsideSet(_mod_revision.NULL_REVISION, rev_set)
+
+    def test_ghost(self):
+        self.builder.build_snapshot('revid1', None, [])
+        rev_set = ['ghost', 'revid1']
+        self.assertRevisionOutsideSet('initial', rev_set)
+
+    def test_ghost_parent(self):
+        self.builder.build_snapshot('revid1', None, [])
+        self.builder.build_snapshot('revid2', ['revid1', 'ghost'], [])
+        rev_set = ['revid2', 'revid1']
+        self.assertRevisionOutsideSet('initial', rev_set)
+
+    def test_righthand_parent(self):
+        self.builder.build_snapshot('revid1', None, [])
+        self.builder.build_snapshot('revid2a', ['revid1'], [])
+        self.builder.build_snapshot('revid2b', ['revid1'], [])
+        self.builder.build_snapshot('revid3', ['revid2a', 'revid2b'], [])
+        rev_set = ['revid3', 'revid2a']
+        self.assertRevisionOutsideSet('revid2b', rev_set)
+
+
 class TestWithBrokenRepo(TestCaseWithTransport):
     """These tests seem to be more appropriate as interface tests?"""
 

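The search performed by _find_revision_outside_set in the pack_repo.py hunk above can be illustrated without a repository. The following is only a sketch under stated assumptions, not the bzrlib implementation: the function name find_revision_outside_set, the plain-dict parent_map argument and the NULL_REVISION stand-in are hypothetical, and a revision with no entry in the dict is treated as a ghost. The graphs loosely mirror test_righthand_parent and test_ghost_parent.

```python
# Stand-in sentinel; assumed here to play the role of
# bzrlib.revision.NULL_REVISION in the real code.
NULL_REVISION = "null:"


def find_revision_outside_set(parent_map, revision_ids):
    """Return the first non-ghost parent that lies outside revision_ids.

    parent_map is a hypothetical dict of revision id -> tuple of parent
    ids. Any revision id missing from the dict is considered a ghost.
    """
    revision_set = frozenset(revision_ids)
    for revid in revision_ids:
        for parent in parent_map.get(revid, ()):
            if parent in revision_set:
                # Parent is inside the set, so it is not a boundary.
                continue
            if parent not in parent_map:
                # Parent is a ghost: referenced but not present.
                continue
            return parent
    # No present parent outside the set was found.
    return NULL_REVISION


# A small merge graph: revid3 merges revid2a and revid2b.
merge_graph = {
    "initial": (),
    "revid1": ("initial",),
    "revid2a": ("revid1",),
    "revid2b": ("revid1",),
    "revid3": ("revid2a", "revid2b"),
}

# A graph in which revid2 names a parent that is not present.
ghost_graph = {
    "initial": (),
    "revid1": ("initial",),
    "revid2": ("revid1", "ghost"),
}

righthand = find_revision_outside_set(merge_graph, ["revid3", "revid2a"])
skips_ghost = find_revision_outside_set(ghost_graph, ["revid2", "revid1"])
nothing_outside = find_revision_outside_set(merge_graph, ["initial"])
```

As in the diff, the input order matters: the first revision in revision_ids that has a present parent outside the set decides the result, so the returned revision is a boundary of the set rather than a unique answer.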


