Rev 3182: (robertc) Stop fetching ghosts over the smart server pulls by default. in file:///home/pqm/archives/thelove/bzr/%2Btrunk/

Canonical.com Patch Queue Manager pqm at pqm.ubuntu.com
Tue Jan 15 06:16:02 GMT 2008


At file:///home/pqm/archives/thelove/bzr/%2Btrunk/

------------------------------------------------------------
revno: 3182
revision-id:pqm at pqm.ubuntu.com-20080115061554-qyfxjo4fxlsjobar
parent: pqm at pqm.ubuntu.com-20080115020129-jl22ugxkca1rox94
parent: robertc at robertcollins.net-20080114030008-xdf5xvub5prv2zal
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Tue 2008-01-15 06:15:54 +0000
message:
  (robertc) Stop fetching ghosts over the smart server pulls by
  	default. (Robert Collins)
modified:
  NEWS                           NEWS-20050323055033-4e00b5db738777ff
  bzrlib/fetch.py                fetch.py-20050818234941-26fea6105696365d
  bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
    ------------------------------------------------------------
    revno: 3172.3.1.1.4
    revision-id:robertc at robertcollins.net-20080114030008-xdf5xvub5prv2zal
    parent: robertc at robertcollins.net-20080114005859-0o83m18iul1xwy0v
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: more-find-ghosts
    timestamp: Mon 2008-01-14 14:00:08 +1100
    message:
      Review feedback.
    modified:
      bzrlib/fetch.py                fetch.py-20050818234941-26fea6105696365d
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
    ------------------------------------------------------------
    revno: 3172.3.1.1.3
    revision-id:robertc at robertcollins.net-20080114005859-0o83m18iul1xwy0v
    parent: robertc at robertcollins.net-20080114000613-3zal7v1to7k9xp5b
    parent: robertc at robertcollins.net-20080114005456-d4a5iief649hmtiy
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: more-find-ghosts
    timestamp: Mon 2008-01-14 11:58:59 +1100
    message:
      More graph support kthanxbye.
    modified:
      bzrlib/graph.py                graph_walker.py-20070525030359-y852guab65d4wtn0-1
      bzrlib/tests/test_graph.py     test_graph_walker.py-20070525030405-enq4r60hhi9xrujc-1
    ------------------------------------------------------------
    revno: 3172.3.1.1.2
    revision-id:robertc at robertcollins.net-20080114000613-3zal7v1to7k9xp5b
    parent: robertc at robertcollins.net-20080111043302-0pi5csqyr1ugry00
    parent: robertc at robertcollins.net-20080113235717-9a1w22q93j81nd0o
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: more-find-ghosts
    timestamp: Mon 2008-01-14 11:06:13 +1100
    message:
      Merge next_with_ghosts support.
    modified:
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
      bzrlib/debug.py                debug.py-20061102062349-vdhrw9qdpck8cl35-1
      bzrlib/graph.py                graph_walker.py-20070525030359-y852guab65d4wtn0-1
      bzrlib/help_topics/__init__.py help_topics.py-20060920210027-rnim90q9e0bwxvy4-1
      bzrlib/knit.py                 knit.py-20051212171256-f056ac8f0fbe1bd9
      bzrlib/reconfigure.py          reconfigure.py-20070908040425-6ykgo7escxhyrg9p-1
      bzrlib/remote.py               remote.py-20060720103555-yeeg2x51vn0rbtdp-1
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
      bzrlib/smart/client.py         client.py-20061116014825-2k6ada6xgulslami-1
      bzrlib/smart/repository.py     repository.py-20061128022038-vr5wy5bubyb8xttk-1
      bzrlib/smart/request.py        request.py-20061108095550-gunadhxmzkdjfeek-1
      bzrlib/tests/test_graph.py     test_graph_walker.py-20070525030405-enq4r60hhi9xrujc-1
      bzrlib/tests/test_osutils.py   test_osutils.py-20051201224856-e48ee24c12182989
      bzrlib/tests/test_reconfigure.py test_reconfigure.py-20070908040425-6ykgo7escxhyrg9p-2
      bzrlib/tests/test_remote.py    test_remote.py-20060720103555-yeeg2x51vn0rbtdp-2
      bzrlib/tests/test_smart.py     test_smart.py-20061122024551-ol0l0o0oofsu9b3t-2
      bzrlib/tests/test_smart_transport.py test_ssh_transport.py-20060608202016-c25gvf1ob7ypbus6-2
    ------------------------------------------------------------
    revno: 3172.3.1.1.1
    revision-id:robertc at robertcollins.net-20080111043302-0pi5csqyr1ugry00
    parent: robertc at robertcollins.net-20080111035451-52at4031ohbmtoh2
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: more-find-ghosts
    timestamp: Fri 2008-01-11 15:33:02 +1100
    message:
       * Fetching via bzr+ssh will no longer fill ghosts by default (this is
         consistent with pack-0.92 fetching over SFTP). (Robert Collins)
      
       * Fetching now passes the find_ghosts flag through to the 
         ``InterRepository.missing_revision_ids`` call consistently for all
         repository types. This will enable faster missing revision discovery with
         bzr+ssh. (Robert Collins)
    modified:
      NEWS                           NEWS-20050323055033-4e00b5db738777ff
      bzrlib/fetch.py                fetch.py-20050818234941-26fea6105696365d
      bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
=== modified file 'NEWS'
--- a/NEWS	2008-01-15 02:01:29 +0000
+++ b/NEWS	2008-01-15 06:15:54 +0000
@@ -14,6 +14,9 @@
      been made to remove the ambiguity where ``branch2`` is in fact a
      specific file to diff within ``branch1``.
 
+   * Fetching via bzr+ssh will no longer fill ghosts by default (this is
+     consistent with pack-0.92 fetching over SFTP). (Robert Collins)
+
   FEATURES:
 
    * New option to use custom template-based formats in  ``bzr version-info``.
@@ -158,6 +161,11 @@
    * Add -Dtimes debug flag, which records a timestamp against each mutter to
      the trace file, relative to the first mutter.  (Andrew Bennetts)
    
+   * Fetching now passes the find_ghosts flag through to the 
+     ``InterRepository.missing_revision_ids`` call consistently for all
+     repository types. This will enable faster missing revision discovery with
+     bzr+ssh. (Robert Collins)
+
    * find_* methods available for BzrDirs, Branches and WorkingTrees.
      (Aaron Bentley)
 

=== modified file 'bzrlib/fetch.py'
--- a/bzrlib/fetch.py	2008-01-11 05:08:20 +0000
+++ b/bzrlib/fetch.py	2008-01-14 03:00:08 +0000
@@ -75,7 +75,13 @@
     This should not be used directly, it's essential a object to encapsulate
     the logic in InterRepository.fetch().
     """
-    def __init__(self, to_repository, from_repository, last_revision=None, pb=None):
+
+    def __init__(self, to_repository, from_repository, last_revision=None, pb=None,
+        find_ghosts=True):
+        """Create a repo fetcher.
+
+        :param find_ghosts: If True search the entire history for ghosts.
+        """
         # result variables.
         self.failed_revisions = []
         self.count_copied = 0
@@ -88,6 +94,7 @@
         self.from_repository = from_repository
         # must not mutate self._last_revision as its potentially a shared instance
         self._last_revision = last_revision
+        self.find_ghosts = find_ghosts
         if pb is None:
             self.pb = bzrlib.ui.ui_factory.nested_progress_bar()
             self.nested_pb = self.pb
@@ -196,7 +203,7 @@
             # XXX: this gets the full graph on both sides, and will make sure
             # that ghosts are filled whether or not you care about them.
             return self.to_repository.missing_revision_ids(self.from_repository,
-                                                           self._last_revision)
+                self._last_revision, find_ghosts=self.find_ghosts)
         except errors.NoSuchRevision:
             raise InstallFailed([self._last_revision])
 
@@ -329,6 +336,8 @@
 
         :param revs: A list of revision ids
         """
+        # In case that revs is not a list.
+        revs = list(revs)
         while revs:
             for tree in self.source.revision_trees(revs[:100]):
                 if tree.inventory.revision_id is None:
@@ -372,10 +381,10 @@
     """Fetch from a Model1 repository into a Knit2 repository
     """
     def __init__(self, to_repository, from_repository, last_revision=None,
-                 pb=None):
+                 pb=None, find_ghosts=True):
         self.helper = Inter1and2Helper(from_repository, to_repository)
         GenericRepoFetcher.__init__(self, to_repository, from_repository,
-                                    last_revision, pb)
+            last_revision, pb, find_ghosts)
 
     def _generate_root_texts(self, revs):
         self.helper.generate_root_texts(revs)
@@ -388,10 +397,10 @@
     """Fetch from a Knit1 repository into a Knit2 repository"""
 
     def __init__(self, to_repository, from_repository, last_revision=None, 
-                 pb=None):
+                 pb=None, find_ghosts=True):
         self.helper = Inter1and2Helper(from_repository, to_repository)
         KnitRepoFetcher.__init__(self, to_repository, from_repository,
-                                 last_revision, pb)
+            last_revision, pb, find_ghosts)
 
     def _generate_root_texts(self, revs):
         self.helper.generate_root_texts(revs)

=== modified file 'bzrlib/repository.py'
--- a/bzrlib/repository.py	2008-01-11 08:32:51 +0000
+++ b/bzrlib/repository.py	2008-01-14 03:00:08 +0000
@@ -2309,6 +2309,39 @@
         (copied, failures).
         """
         raise NotImplementedError(self.fetch)
+
+    def _walk_to_common_revisions(self, revision_ids):
+        """Walk out from revision_ids in source to revisions target has.
+
+        :param revision_ids: The start point for the search.
+        :return: A set of revision ids.
+        """
+        graph = self.source.get_graph()
+        missing_revs = set()
+        # ensure we don't pay silly lookup costs.
+        revision_ids = frozenset(revision_ids)
+        searcher = graph._make_breadth_first_searcher(revision_ids)
+        null_set = frozenset([_mod_revision.NULL_REVISION])
+        while True:
+            try:
+                next_revs, ghosts = searcher.next_with_ghosts()
+            except StopIteration:
+                break
+            if revision_ids.intersection(ghosts):
+                absent_ids = set(revision_ids.intersection(ghosts))
+                # If all absent_ids are present in target, no error is needed.
+                absent_ids.difference_update(
+                    self.target.has_revisions(absent_ids))
+                if absent_ids:
+                    raise errors.NoSuchRevision(self.source, absent_ids.pop())
+            # we don't care about other ghosts as we can't fetch them and
+            # haven't been asked to.
+            next_revs = set(next_revs)
+            next_revs.difference_update(null_set)
+            have_revs = self.target.has_revisions(next_revs)
+            missing_revs.update(next_revs - have_revs)
+            searcher.stop_searching_any(have_revs)
+        return missing_revs
    
     @needs_read_lock
     def missing_revision_ids(self, revision_id=None, find_ghosts=True):
@@ -2318,7 +2351,12 @@
 
         :param revision_id: only return revision ids included by this
                             revision_id.
+        :param find_ghosts: If True find missing revisions in deep history
+            rather than just finding the surface difference.
         """
+        # stop searching at found target revisions.
+        if not find_ghosts and revision_id is not None:
+            return self._walk_to_common_revisions([revision_id])
         # generic, possibly worst case, slow code path.
         target_ids = set(self.target.all_revision_ids())
         if revision_id is not None:
@@ -2396,7 +2434,7 @@
         f = GenericRepoFetcher(to_repository=self.target,
                                from_repository=self.source,
                                last_revision=revision_id,
-                               pb=pb)
+                               pb=pb, find_ghosts=find_ghosts)
         return f.count_copied, f.failed_revisions
 
 
@@ -2474,7 +2512,7 @@
         f = GenericRepoFetcher(to_repository=self.target,
                                from_repository=self.source,
                                last_revision=revision_id,
-                               pb=pb)
+                               pb=pb, find_ghosts=find_ghosts)
         return f.count_copied, f.failed_revisions
 
     @needs_read_lock
@@ -2552,7 +2590,7 @@
         f = KnitRepoFetcher(to_repository=self.target,
                             from_repository=self.source,
                             last_revision=revision_id,
-                            pb=pb)
+                            pb=pb, find_ghosts=find_ghosts)
         return f.count_copied, f.failed_revisions
 
     @needs_read_lock
@@ -2657,30 +2695,11 @@
     def missing_revision_ids(self, revision_id=None, find_ghosts=True):
         """See InterRepository.missing_revision_ids().
         
-        :param find_ghosts: Find ghosts throughough the ancestry of
+        :param find_ghosts: Find ghosts throughout the ancestry of
             revision_id.
         """
         if not find_ghosts and revision_id is not None:
-            graph = self.source.get_graph()
-            missing_revs = set()
-            searcher = graph._make_breadth_first_searcher([revision_id])
-            null_set = frozenset([_mod_revision.NULL_REVISION])
-            while True:
-                try:
-                    next_revs = set(searcher.next())
-                except StopIteration:
-                    break
-                next_revs.difference_update(null_set)
-                have_revs = self.target.has_revisions(next_revs)
-                missing_revs.update(next_revs - have_revs)
-                searcher.stop_searching_any(have_revs)
-            if next_revs - have_revs == set([revision_id]):
-                # we saw the start rev itself, but no parents from it (or
-                # next_revs would have been updated to e.g. set(). We remove
-                # have_revs because if we found revision_id locally we
-                # stop_searching at the first time around.
-                raise errors.NoSuchRevision(self.source, revision_id)
-            return missing_revs
+            return self._walk_to_common_revisions([revision_id])
         elif revision_id is not None:
             source_ids = self.source.get_ancestry(revision_id)
             assert source_ids[0] is None
@@ -2715,7 +2734,7 @@
         f = Model1toKnit2Fetcher(to_repository=self.target,
                                  from_repository=self.source,
                                  last_revision=revision_id,
-                                 pb=pb)
+                                 pb=pb, find_ghosts=find_ghosts)
         return f.count_copied, f.failed_revisions
 
     @needs_write_lock
@@ -2772,7 +2791,7 @@
         f = Knit1to2Fetcher(to_repository=self.target,
                             from_repository=self.source,
                             last_revision=revision_id,
-                            pb=pb)
+                            pb=pb, find_ghosts=find_ghosts)
         return f.count_copied, f.failed_revisions
 
 
@@ -2798,7 +2817,7 @@
     def fetch(self, revision_id=None, pb=None, find_ghosts=False):
         """See InterRepository.fetch()."""
         revision_ids = self.target.missing_revision_ids(self.source,
-                                                        revision_id)
+            revision_id, find_ghosts=find_ghosts)
         def revisions_iterator():
             for current_revision_id in revision_ids:
                 revision = self.source.get_revision(current_revision_id)
@@ -2851,7 +2870,7 @@
         f = RemoteToOtherFetcher(to_repository=self.target,
                                  from_repository=self.source,
                                  last_revision=revision_id,
-                                 pb=pb)
+                                 pb=pb, find_ghosts=find_ghosts)
         return f.count_copied, f.failed_revisions
 
     @classmethod
@@ -2883,7 +2902,8 @@
 
     def fetch(self, revision_id=None, pb=None, find_ghosts=False):
         self._ensure_real_inter()
-        self._real_inter.fetch(revision_id=revision_id, pb=pb)
+        self._real_inter.fetch(revision_id=revision_id, pb=pb,
+            find_ghosts=find_ghosts)
 
     @classmethod
     def _get_repo_format_to_test(self):
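
The `_walk_to_common_revisions` helper added to bzrlib/repository.py in this change walks breadth-first from the requested revisions and stops descending wherever the target already has a revision, so ghosts deep in history are never looked for. The following is only a rough standalone sketch of that idea, not the bzrlib code: plain dicts and sets stand in for the graph and repository objects, and the function name, the parameters, and the use of `KeyError` in place of `errors.NoSuchRevision` are all assumptions made for illustration.

```python
# Illustrative sketch only: dicts and sets stand in for bzrlib's graph and
# repository objects. None of the names below are bzrlib API.

def walk_to_common_revisions(source_parents, target_revs, start_ids):
    """Return the revisions reachable from start_ids that the target lacks.

    source_parents maps a revision id to a tuple of its parent ids. A parent
    id that is not itself a key is treated as a ghost (referenced by the
    source but not present in it). target_revs is the set of revision ids
    the target already holds.
    """
    start_ids = frozenset(start_ids)
    missing = set()
    seen = set()
    frontier = set(start_ids)
    while frontier:
        seen.update(frontier)
        ghosts = {rev for rev in frontier if rev not in source_parents}
        present = frontier - ghosts
        # A requested start revision that neither side has is an error;
        # KeyError stands in for errors.NoSuchRevision here.
        absent = (start_ids & ghosts) - target_revs
        if absent:
            raise KeyError(sorted(absent)[0])
        # Other ghosts are ignored: they cannot be fetched and were not
        # asked for.
        have = present & target_revs
        missing.update(present - have)
        # Only keep walking below revisions the target does not have.
        next_frontier = set()
        for rev in present - have:
            next_frontier.update(source_parents[rev])
        frontier = next_frontier - seen
    return missing


# Hypothetical history: d -> c -> b -> a, where c also names a ghost parent.
history = {"d": ("c",), "c": ("b", "ghost"), "b": ("a",), "a": ()}
result = walk_to_common_revisions(history, {"a", "b"}, ["d"])
```

In this example the walk stops at `b`, which the target already has, so `a` is never visited and the ghost parent of `c` is neither reported as missing nor treated as an error.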



