Rev 3443: (jam) refactor find_unique_ancestors for clarity, in file:///home/pqm/archives/thelove/bzr/%2Btrunk/
Canonical.com Patch Queue Manager
pqm at pqm.ubuntu.com
Wed May 21 06:46:41 BST 2008
At file:///home/pqm/archives/thelove/bzr/%2Btrunk/
------------------------------------------------------------
revno: 3443
revision-id:pqm at pqm.ubuntu.com-20080521054623-5ayhndccwr6o45uq
parent: pqm at pqm.ubuntu.com-20080521031911-acjem4ky3hjs8gjh
parent: john at arbash-meinel.com-20080521021300-2pqe6eso7w3jrd0u
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Wed 2008-05-21 06:46:23 +0100
message:
(jam) refactor find_unique_ancestors for clarity,
and collapse searchers for performance
modified:
bzrlib/graph.py graph_walker.py-20070525030359-y852guab65d4wtn0-1
bzrlib/tests/test_graph.py test_graph_walker.py-20070525030405-enq4r60hhi9xrujc-1
------------------------------------------------------------
revno: 3377.3.42.1.9
revision-id:john at arbash-meinel.com-20080521021300-2pqe6eso7w3jrd0u
parent: john at arbash-meinel.com-20080521015055-gs6pzl7zpbcyj2p5
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: find_unique_ancestors
timestamp: Tue 2008-05-20 21:13:00 -0500
message:
STEP every 5
modified:
bzrlib/graph.py graph_walker.py-20070525030359-y852guab65d4wtn0-1
------------------------------------------------------------
revno: 3377.3.42.1.8
revision-id:john at arbash-meinel.com-20080521015055-gs6pzl7zpbcyj2p5
parent: john at arbash-meinel.com-20080520232241-hodoh6db5wslse1o
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: find_unique_ancestors
timestamp: Tue 2008-05-20 20:50:55 -0500
message:
Final tweaks from Ian
modified:
bzrlib/graph.py graph_walker.py-20070525030359-y852guab65d4wtn0-1
------------------------------------------------------------
revno: 3377.3.42.1.7
revision-id:john at arbash-meinel.com-20080520232241-hodoh6db5wslse1o
parent: john at arbash-meinel.com-20080520231020-wh8g3pq0whu2p78i
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: find_unique_ancestors
timestamp: Tue 2008-05-20 18:22:41 -0500
message:
Small documentation and code wrapping cleanup
modified:
bzrlib/graph.py graph_walker.py-20070525030359-y852guab65d4wtn0-1
------------------------------------------------------------
revno: 3377.3.42.1.6
revision-id:john at arbash-meinel.com-20080520231020-wh8g3pq0whu2p78i
parent: john at arbash-meinel.com-20080513221726-vqpvwthr1tncpb29
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: find_unique_ancestors
timestamp: Tue 2008-05-20 18:10:20 -0500
message:
Lots of refactoring for find_unique_ancestors.
Split up the function into a bunch of helper functions.
modified:
bzrlib/graph.py graph_walker.py-20070525030359-y852guab65d4wtn0-1
------------------------------------------------------------
revno: 3377.3.42.1.5
revision-id:john at arbash-meinel.com-20080513221726-vqpvwthr1tncpb29
parent: john at arbash-meinel.com-20080507220932-1ofutrrd6eu1dtdy
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: find_unique_ancestors
timestamp: Tue 2008-05-13 17:17:26 -0500
message:
Several updates: a bit more debug logging, and only step the all_unique searcher 1/10th of the time.
modified:
bzrlib/graph.py graph_walker.py-20070525030359-y852guab65d4wtn0-1
------------------------------------------------------------
revno: 3377.3.42.1.4
revision-id:john at arbash-meinel.com-20080507220932-1ofutrrd6eu1dtdy
parent: john at arbash-meinel.com-20080506220718-3nqdforqiwifqvga
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: find_unique_ancestors
timestamp: Wed 2008-05-07 17:09:32 -0500
message:
Restore the previous code, but bring in a couple of changes, including an update to have lsprof show where the time is spent.
modified:
bzrlib/graph.py graph_walker.py-20070525030359-y852guab65d4wtn0-1
------------------------------------------------------------
revno: 3377.3.42.1.3
revision-id:john at arbash-meinel.com-20080506220718-3nqdforqiwifqvga
parent: john at arbash-meinel.com-20080506205431-sr3y667ytc6t1pdu
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: find_unique_ancestors
timestamp: Tue 2008-05-06 17:07:18 -0500
message:
Try a different method that collects searchers together; however, it ends up performing worse than the existing form.
modified:
bzrlib/graph.py graph_walker.py-20070525030359-y852guab65d4wtn0-1
------------------------------------------------------------
revno: 3377.3.42.1.2
revision-id:john at arbash-meinel.com-20080506205431-sr3y667ytc6t1pdu
parent: john at arbash-meinel.com-20080501224934-3i8kxtuyr9r21711
parent: pqm at pqm.ubuntu.com-20080505231558-7w3zaehbvtcjk7jv
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: find_unique_ancestors
timestamp: Tue 2008-05-06 15:54:31 -0500
message:
Merge in the bzr.dev changes
added:
bzrlib/tests/branch_implementations/test_check.py test_check.py-20080429151303-1sbfclxhddpz0tnj-1
bzrlib/tests/branch_implementations/test_reconcile.py test_reconcile.py-20080429161555-qlmccuyeyt6pvho7-1
bzrlib/tests/interrepository_implementations/test_fetch.py test_fetch.py-20080425213627-j60cjh782ufm83ry-1
modified:
Makefile Makefile-20050805140406-d96e3498bb61c5bb
NEWS NEWS-20050323055033-4e00b5db738777ff
bzr bzr.py-20050313053754-5485f144c7006fa6
bzrlib/__init__.py __init__.py-20050309040759-33e65acf91bbcd5d
bzrlib/branch.py branch.py-20050309040759-e4baf4e0d046576e
bzrlib/builtins.py builtins.py-20050830033751-fc01482b9ca23183
bzrlib/bundle/bundle_data.py read_changeset.py-20050619171944-c0d95aa685537640
bzrlib/bundle/serializer/v4.py v10.py-20070611062757-5ggj7k18s9dej0fr-1
bzrlib/config.py config.py-20051011043216-070c74f4e9e338e8
bzrlib/diff.py diff.py-20050309040759-26944fbbf2ebbf36
bzrlib/errors.py errors.py-20050309040759-20512168c4e14fbd
bzrlib/fetch.py fetch.py-20050818234941-26fea6105696365d
bzrlib/help_topics/__init__.py help_topics.py-20060920210027-rnim90q9e0bwxvy4-1
bzrlib/hooks.py hooks.py-20070325015548-ix4np2q0kd8452au-1
bzrlib/knit.py knit.py-20051212171256-f056ac8f0fbe1bd9
bzrlib/lockable_files.py control_files.py-20051111201905-bb88546e799d669f
bzrlib/log.py log.py-20050505065812-c40ce11702fe5fb1
bzrlib/mutabletree.py mutabletree.py-20060906023413-4wlkalbdpsxi2r4y-2
bzrlib/reconcile.py reweave_inventory.py-20051108164726-1e5e0934febac06e
bzrlib/reconfigure.py reconfigure.py-20070908040425-6ykgo7escxhyrg9p-1
bzrlib/remote.py remote.py-20060720103555-yeeg2x51vn0rbtdp-1
bzrlib/repofmt/knitrepo.py knitrepo.py-20070206081537-pyy4a00xdas0j4pf-1
bzrlib/repository.py rev_storage.py-20051111201905-119e9401e46257e3
bzrlib/revisionspec.py revisionspec.py-20050907152633-17567659fd5c0ddb
bzrlib/revisiontree.py revisiontree.py-20060724012533-bg8xyryhxd0o0i0h-1
bzrlib/smart/repository.py repository.py-20061128022038-vr5wy5bubyb8xttk-1
bzrlib/smart/server.py server.py-20061110062051-chzu10y32vx8gvur-1
bzrlib/status.py status.py-20050505062338-431bfa63ec9b19e6
bzrlib/store/revision/knit.py knit.py-20060303020652-de5fa299e941a3c7
bzrlib/store/revision/text.py text.py-20060303020652-e49155f0da4d14ab
bzrlib/symbol_versioning.py symbol_versioning.py-20060105104851-9ecf8af605d15a80
bzrlib/tests/__init__.py selftest.py-20050531073622-8d0e3c8845c97a64
bzrlib/tests/blackbox/test_branch.py test_branch.py-20060524161337-noms9gmcwqqrfi8y-1
bzrlib/tests/blackbox/test_hooks.py test_hooks.py-20080308163236-xljgf9j41hik1x21-1
bzrlib/tests/blackbox/test_reconcile.py test_fix.py-20060223013051-9a188e15a5ee9451
bzrlib/tests/blackbox/test_version.py test_version.py-20070312060045-ol7th9z035r3im3d-1
bzrlib/tests/branch_implementations/__init__.py __init__.py-20060123013057-b12a52c3f361daf4
bzrlib/tests/branch_implementations/test_branch.py testbranch.py-20050711070244-121d632bc37d7253
bzrlib/tests/branch_implementations/test_commit.py test_commit.py-20070206022134-117z1i5b644p63r0-1
bzrlib/tests/branch_implementations/test_hooks.py test_hooks.py-20070129154855-blhpwxmvjs07waei-1
bzrlib/tests/branch_implementations/test_pull.py test_pull.py-20060410103942-83c35b26657414fc
bzrlib/tests/branch_implementations/test_push.py test_push.py-20070130153159-fhfap8uoifevg30j-1
bzrlib/tests/branch_implementations/test_uncommit.py test_uncommit.py-20070205180410-ge7058d9138mvq3x-1
bzrlib/tests/interrepository_implementations/__init__.py __init__.py-20060220054744-baf49a1f88f17b1a
bzrlib/tests/interrepository_implementations/test_interrepository.py test_interrepository.py-20060220061411-1ec13fa99e5e3eee
bzrlib/tests/repository_implementations/test_fetch.py test_fetch.py-20070814052151-5cxha9slx4c93uog-1
bzrlib/tests/repository_implementations/test_repository.py test_repository.py-20060131092128-ad07f494f5c9d26c
bzrlib/tests/test_bundle.py test.py-20050630184834-092aa401ab9f039c
bzrlib/tests/test_config.py testconfig.py-20051011041908-742d0c15d8d8c8eb
bzrlib/tests/test_fetch.py testfetch.py-20050825090644-f73e07e7dfb1765a
bzrlib/tests/test_graph.py test_graph_walker.py-20070525030405-enq4r60hhi9xrujc-1
bzrlib/tests/test_hooks.py test_hooks.py-20070628030849-89rtsbe5dmer5npz-1
bzrlib/tests/test_knit.py test_knit.py-20051212171302-95d4c00dd5f11f2b
bzrlib/tests/test_lockable_files.py test_lockable_files.py-20051225183927-365c7fd99591caf1
bzrlib/tests/test_osutils.py test_osutils.py-20051201224856-e48ee24c12182989
bzrlib/tests/test_reconcile.py test_reconcile.py-20060225054842-50aa618584a86f26
bzrlib/tests/test_reconfigure.py test_reconfigure.py-20070908040425-6ykgo7escxhyrg9p-2
bzrlib/tests/test_remote.py test_remote.py-20060720103555-yeeg2x51vn0rbtdp-2
bzrlib/tests/test_repository.py test_repository.py-20060131075918-65c555b881612f4d
bzrlib/tests/test_selftest.py test_selftest.py-20051202044319-c110a115d8c0456a
bzrlib/tests/test_smart.py test_smart.py-20061122024551-ol0l0o0oofsu9b3t-2
bzrlib/tests/test_smart_transport.py test_ssh_transport.py-20060608202016-c25gvf1ob7ypbus6-2
bzrlib/tests/test_store.py teststore.py-20050826022702-f6caadb647395769
bzrlib/tests/transport_util.py transportutil.py-20070525113600-5v2igk89s8fensom-1
bzrlib/tests/tree_implementations/__init__.py __init__.py-20060717075546-420s7b0bj9hzeowi-2
bzrlib/tests/tree_implementations/test_inv.py test_inv.py-20070312023226-0cdvk5uwhutis9vg-1
bzrlib/tests/tree_implementations/test_test_trees.py test_tree_trees.py-20060720091921-3nwi5h21lf06vf5p-1
bzrlib/tests/workingtree_implementations/test_basis_inventory.py test_basis_inventory.py-20051218151655-3650468941091309
bzrlib/tests/workingtree_implementations/test_commit.py test_commit.py-20060421013633-1610ec2331c8190f
bzrlib/transform.py transform.py-20060105172343-dd99e54394d91687
bzrlib/transport/ftp.py ftp.py-20051116161804-58dc9506548c2a53
bzrlib/tree.py tree.py-20050309040759-9d5f2496be663e77
bzrlib/workingtree.py workingtree.py-20050511021032-29b6ec0a681e02e3
bzrlib/workingtree_4.py workingtree_4.py-20070208044105-5fgpc5j3ljlh5q6c-1
doc/default.css default.css-20060622101119-tgwtdci8z769bjb9-1
doc/developers/HACKING.txt HACKING-20050805200004-2a5dc975d870f78c
doc/developers/network-protocol.txt networkprotocol.txt-20070903044232-woustorrjbmg5zol-1
doc/en/user-guide/controlling_registration.txt controlling_registra-20071121073725-0corxykv5irjal00-3
doc/en/user-guide/hooks.txt hooks.txt-20070829200551-7nr6e5a1io6x78uf-1
doc/en/user-guide/sending_changes.txt sending_changes.txt-20071123154453-dk2mjhrg1vpjm5w2-4
------------------------------------------------------------
revno: 3377.3.42.1.1
revision-id:john at arbash-meinel.com-20080501224934-3i8kxtuyr9r21711
parent: john at arbash-meinel.com-20080501204100-r9pw4d1znr8g728x
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: find_unique_ancestors
timestamp: Thu 2008-05-01 17:49:34 -0500
message:
Work on a way to collapse the searchers earlier.
modified:
bzrlib/graph.py graph_walker.py-20070525030359-y852guab65d4wtn0-1
bzrlib/tests/test_graph.py test_graph_walker.py-20070525030405-enq4r60hhi9xrujc-1
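
Before the diff itself, a condensed sketch of the new control flow may help
orientation. This is not code from the commit, just the shape of the
refactored find_unique_ancestors with helper bodies elided (the helper names
match the diff below):

    def find_unique_ancestors(self, unique_revision, common_revisions):
        # Steps 1-3: expand both frontiers until no new unique nodes appear.
        unique_searcher, common_searcher = self._find_initial_unique_nodes(
            [unique_revision], common_revisions)
        unique_nodes = unique_searcher.seen.difference(common_searcher.seen)
        if not unique_nodes:
            return unique_nodes
        # Step 4: one searcher per unique tip, plus a slower searcher over
        # the ancestry common to all unique tips.
        (all_unique_searcher,
         unique_tip_searchers) = self._make_unique_searchers(
            unique_nodes, unique_searcher, common_searcher)
        # Steps 5-8: refine until the common searcher has no tips left.
        self._refine_unique_nodes(unique_searcher, all_unique_searcher,
                                  unique_tip_searchers, common_searcher)
        return unique_nodes.difference(common_searcher.seen)
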
=== modified file 'bzrlib/graph.py'
--- a/bzrlib/graph.py 2008-05-08 04:41:46 +0000
+++ b/bzrlib/graph.py 2008-05-21 05:46:23 +0000
@@ -14,6 +14,8 @@
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+import time
+
from bzrlib import (
debug,
errors,
@@ -24,6 +26,8 @@
)
from bzrlib.deprecated_graph import (node_distances, select_farthest)
+STEP_UNIQUE_SEARCHER_EVERY = 5
+
# DIAGRAM of terminology
# A
# /\
@@ -236,19 +240,49 @@
# information you have so far.
# 5) Continue searching, stopping the common searches when the search
# tip is an ancestor of all unique nodes.
- # 6) Search is done when all common searchers have completed.
-
+ # 6) Aggregate together unique searchers when they are searching the
+ # same tips. When all unique searchers are searching the same node,
+ # stop them and move the search to a single 'all_unique_searcher'.
+ # 7) The 'all_unique_searcher' represents the very 'tip' of searching.
+ # Most of the time this produces very little important information.
+ # So don't step it as quickly as the other searchers.
+ # 8) Search is done when all common searchers have completed.
+
+ unique_searcher, common_searcher = self._find_initial_unique_nodes(
+ [unique_revision], common_revisions)
+
+ unique_nodes = unique_searcher.seen.difference(common_searcher.seen)
+ if not unique_nodes:
+ return unique_nodes
+
+ (all_unique_searcher,
+ unique_tip_searchers) = self._make_unique_searchers(unique_nodes,
+ unique_searcher, common_searcher)
+
+ self._refine_unique_nodes(unique_searcher, all_unique_searcher,
+ unique_tip_searchers, common_searcher)
+ true_unique_nodes = unique_nodes.difference(common_searcher.seen)
if 'graph' in debug.debug_flags:
- _mutter = trace.mutter
- else:
- def _mutter(*args, **kwargs):
- pass
-
- unique_searcher = self._make_breadth_first_searcher([unique_revision])
- # we know that unique_revision isn't in common_revisions
+ trace.mutter('Found %d truly unique nodes out of %d',
+ len(true_unique_nodes), len(unique_nodes))
+ return true_unique_nodes
+
+ def _find_initial_unique_nodes(self, unique_revisions, common_revisions):
+ """Steps 1-3 of find_unique_ancestors.
+
+ Find the maximal set of unique nodes. Some of these might actually
+ still be common, but we are sure that there are no other unique nodes.
+
+ :return: (unique_searcher, common_searcher)
+ """
+
+ unique_searcher = self._make_breadth_first_searcher(unique_revisions)
+ # we know that unique_revisions aren't in common_revisions, so skip
+ # past them.
unique_searcher.next()
common_searcher = self._make_breadth_first_searcher(common_revisions)
+ # As long as we are still finding unique nodes, keep searching
while unique_searcher._next_query:
next_unique_nodes = set(unique_searcher.step())
next_common_nodes = set(common_searcher.step())
@@ -262,120 +296,226 @@
if unique_are_common_nodes:
ancestors = unique_searcher.find_seen_ancestors(
unique_are_common_nodes)
+ # TODO: This is a bit overboard, we only really care about
+ # the ancestors of the tips because the rest we
+ # already know. This is *correct* but causes us to
+ # search too much ancestry.
ancestors.update(common_searcher.find_seen_ancestors(ancestors))
unique_searcher.stop_searching_any(ancestors)
common_searcher.start_searching(ancestors)
- unique_nodes = unique_searcher.seen.difference(common_searcher.seen)
- if not unique_nodes:
- return unique_nodes
+ return unique_searcher, common_searcher
+
+ def _make_unique_searchers(self, unique_nodes, unique_searcher,
+ common_searcher):
+ """Create a searcher for all the unique search tips (step 4).
+
+ As a side effect, the common_searcher will stop searching any nodes
+ that are ancestors of the unique searcher tips.
+
+ :return: (all_unique_searcher, unique_tip_searchers)
+ """
unique_tips = self._remove_simple_descendants(unique_nodes,
self.get_parent_map(unique_nodes))
if len(unique_tips) == 1:
- unique_searchers = []
+ unique_tip_searchers = []
ancestor_all_unique = unique_searcher.find_seen_ancestors(unique_tips)
else:
- unique_searchers = []
+ unique_tip_searchers = []
for tip in unique_tips:
revs_to_search = unique_searcher.find_seen_ancestors([tip])
+ revs_to_search.update(
+ common_searcher.find_seen_ancestors(revs_to_search))
searcher = self._make_breadth_first_searcher(revs_to_search)
# We don't care about the starting nodes.
searcher._label = tip
searcher.step()
- unique_searchers.append(searcher)
+ unique_tip_searchers.append(searcher)
ancestor_all_unique = None
- for searcher in unique_searchers:
+ for searcher in unique_tip_searchers:
if ancestor_all_unique is None:
ancestor_all_unique = set(searcher.seen)
else:
ancestor_all_unique = ancestor_all_unique.intersection(
searcher.seen)
# Collapse all the common nodes into a single searcher
- all_unique_searcher = self._make_breadth_first_searcher(ancestor_all_unique)
+ all_unique_searcher = self._make_breadth_first_searcher(
+ ancestor_all_unique)
if ancestor_all_unique:
+ # We've seen these nodes in all the searchers, so we'll just step
+ # past them to the next layer of the search.
all_unique_searcher.step()
# Stop any search tips that are already known as ancestors of the
# unique nodes
- common_searcher.stop_searching_any(
+ stopped_common = common_searcher.stop_searching_any(
common_searcher.find_seen_ancestors(ancestor_all_unique))
total_stopped = 0
- for searcher in unique_searchers:
+ for searcher in unique_tip_searchers:
total_stopped += len(searcher.stop_searching_any(
searcher.find_seen_ancestors(ancestor_all_unique)))
- _mutter('For %s unique nodes, created %s + 1 unique searchers'
- ' (%s stopped search tips, %s common ancestors)',
- len(unique_nodes), len(unique_searchers), total_stopped,
- len(ancestor_all_unique))
- del ancestor_all_unique
-
+ if 'graph' in debug.debug_flags:
+ trace.mutter('For %d unique nodes, created %d + 1 unique searchers'
+ ' (%d stopped search tips, %d common ancestors,'
+ ' %d stopped common)',
+ len(unique_nodes), len(unique_tip_searchers),
+ total_stopped, len(ancestor_all_unique),
+ len(stopped_common))
+ return all_unique_searcher, unique_tip_searchers
+
+ def _step_unique_and_common_searchers(self, common_searcher,
+ unique_tip_searchers,
+ unique_searcher):
+ """Step all the searchers"""
+ newly_seen_common = set(common_searcher.step())
+ newly_seen_unique = set()
+ for searcher in unique_tip_searchers:
+ next = set(searcher.step())
+ next.update(unique_searcher.find_seen_ancestors(next))
+ next.update(common_searcher.find_seen_ancestors(next))
+ for alt_searcher in unique_tip_searchers:
+ if alt_searcher is searcher:
+ continue
+ next.update(alt_searcher.find_seen_ancestors(next))
+ searcher.start_searching(next)
+ newly_seen_unique.update(next)
+ return newly_seen_common, newly_seen_unique
+
+ def _find_nodes_common_to_all_unique(self, unique_tip_searchers,
+ all_unique_searcher,
+ newly_seen_unique, step_all_unique):
+ """Find nodes that are common to all unique_tip_searchers.
+
+ If it is time, step the all_unique_searcher, and add its nodes to the
+ result.
+ """
+ common_to_all_unique_nodes = newly_seen_unique.copy()
+ for searcher in unique_tip_searchers:
+ common_to_all_unique_nodes.intersection_update(searcher.seen)
+ common_to_all_unique_nodes.intersection_update(
+ all_unique_searcher.seen)
+ # Step all-unique less frequently than the other searchers.
+ # In the common case, we don't need to spider out far here, so
+ # avoid doing extra work.
+ if step_all_unique:
+ tstart = time.clock()
+ nodes = all_unique_searcher.step()
+ common_to_all_unique_nodes.update(nodes)
+ if 'graph' in debug.debug_flags:
+ tdelta = time.clock() - tstart
+ trace.mutter('all_unique_searcher step() took %.3fs'
+ ' for %d nodes (%d total), iteration: %s',
+ tdelta, len(nodes), len(all_unique_searcher.seen),
+ all_unique_searcher._iterations)
+ return common_to_all_unique_nodes
+
+ def _collapse_unique_searchers(self, unique_tip_searchers,
+ common_to_all_unique_nodes):
+ """Combine searchers that are searching the same tips.
+
+ When two searchers are searching the same tips, we can stop one of the
+ searchers. We also know that the maximal set of common ancestors is the
+ intersection of the two original searchers.
+
+ :return: A list of searchers that are searching unique nodes.
+ """
+ # Filter out searchers that don't actually search different
+ # nodes. We already have the ancestry intersection for them
+ unique_search_tips = {}
+ for searcher in unique_tip_searchers:
+ stopped = searcher.stop_searching_any(common_to_all_unique_nodes)
+ will_search_set = frozenset(searcher._next_query)
+ if not will_search_set:
+ if 'graph' in debug.debug_flags:
+ trace.mutter('Unique searcher %s was stopped.'
+ ' (%s iterations) %d nodes stopped',
+ searcher._label,
+ searcher._iterations,
+ len(stopped))
+ elif will_search_set not in unique_search_tips:
+ # This searcher is searching a unique set of nodes, let it
+ unique_search_tips[will_search_set] = [searcher]
+ else:
+ unique_search_tips[will_search_set].append(searcher)
+ # TODO: it might be possible to collapse searchers faster when they
+ # only have *some* search tips in common.
+ next_unique_searchers = []
+ for searchers in unique_search_tips.itervalues():
+ if len(searchers) == 1:
+ # Searching unique tips, go for it
+ next_unique_searchers.append(searchers[0])
+ else:
+ # These searchers have started searching the same tips, we
+ # don't need them to cover the same ground. The
+ # intersection of their ancestry won't change, so create a
+ # new searcher, combining their histories.
+ next_searcher = searchers[0]
+ for searcher in searchers[1:]:
+ next_searcher.seen.intersection_update(searcher.seen)
+ if 'graph' in debug.debug_flags:
+ trace.mutter('Combining %d searchers into a single'
+ ' searcher searching %d nodes with'
+ ' %d ancestry',
+ len(searchers),
+ len(next_searcher._next_query),
+ len(next_searcher.seen))
+ next_unique_searchers.append(next_searcher)
+ return next_unique_searchers
+
+ def _refine_unique_nodes(self, unique_searcher, all_unique_searcher,
+ unique_tip_searchers, common_searcher):
+ """Steps 5-8 of find_unique_ancestors.
+
+ This function returns when common_searcher has stopped searching for
+ more nodes.
+ """
+ # We step the ancestor_all_unique searcher only every
+ # STEP_UNIQUE_SEARCHER_EVERY steps.
+ step_all_unique_counter = 0
# While we still have common nodes to search
while common_searcher._next_query:
- newly_seen_common = set(common_searcher.step())
- newly_seen_unique = set()
- for searcher in unique_searchers:
- newly_seen_unique.update(searcher.step())
+ (newly_seen_common,
+ newly_seen_unique) = self._step_unique_and_common_searchers(
+ common_searcher, unique_tip_searchers, unique_searcher)
# These nodes are common ancestors of all unique nodes
- unique_are_common_nodes = newly_seen_unique.copy()
- for searcher in unique_searchers:
- unique_are_common_nodes = unique_are_common_nodes.intersection(
- searcher.seen)
- unique_are_common_nodes = unique_are_common_nodes.intersection(
- all_unique_searcher.seen)
- unique_are_common_nodes.update(all_unique_searcher.step())
+ common_to_all_unique_nodes = self._find_nodes_common_to_all_unique(
+ unique_tip_searchers, all_unique_searcher, newly_seen_unique,
+ step_all_unique_counter == 0)
+ step_all_unique_counter = ((step_all_unique_counter + 1)
+ % STEP_UNIQUE_SEARCHER_EVERY)
+
if newly_seen_common:
# If a 'common' node is an ancestor of all unique searchers, we
# can stop searching it.
common_searcher.stop_searching_any(
all_unique_searcher.seen.intersection(newly_seen_common))
- if unique_are_common_nodes:
- # We have new common-to-all-unique-searchers nodes
- for searcher in unique_searchers:
- unique_are_common_nodes.update(
- searcher.find_seen_ancestors(unique_are_common_nodes))
- unique_are_common_nodes.update(
- all_unique_searcher.find_seen_ancestors(unique_are_common_nodes))
- # Since these are common, we can grab another set of ancestors
- # that we have seen
- unique_are_common_nodes.update(
- common_searcher.find_seen_ancestors(unique_are_common_nodes))
-
+ if common_to_all_unique_nodes:
+ common_to_all_unique_nodes.update(
+ common_searcher.find_seen_ancestors(
+ common_to_all_unique_nodes))
# The all_unique searcher can start searching the common nodes
# but everyone else can stop.
- all_unique_searcher.start_searching(unique_are_common_nodes)
- common_searcher.stop_searching_any(unique_are_common_nodes)
+ # This is the sort of thing where we would like to not have it
+ # start_searching all of the nodes, but only mark all of them
+ # as seen, and have it search only the actual tips. Otherwise
+ # it is another get_parent_map() traversal for it to figure out
+ # what we already should know.
+ all_unique_searcher.start_searching(common_to_all_unique_nodes)
+ common_searcher.stop_searching_any(common_to_all_unique_nodes)
- # Filter out searchers that don't actually search different
- # nodes. We already have the ancestry intersection for them
- next_unique_searchers = []
- unique_search_sets = set()
- for searcher in unique_searchers:
- stopped = searcher.stop_searching_any(unique_are_common_nodes)
- will_search_set = frozenset(searcher._next_query)
- if not will_search_set:
- _mutter('Unique searcher %s was stopped.'
- ' (%s iterations) %d nodes stopped',
- searcher._label,
- searcher._iterations,
- len(stopped))
- elif will_search_set not in unique_search_sets:
- # This searcher is searching a unique set of nodes, let it
- unique_search_sets.add(will_search_set)
- next_unique_searchers.append(searcher)
- else:
- _mutter('Unique searcher %s stopped for repeated'
- ' search of %s nodes',
- searcher._label, len(will_search_set))
- if len(unique_searchers) != len(next_unique_searchers):
- _mutter('Collapsed %s unique searchers => %s'
- ' at %s iterations',
- len(unique_searchers), len(next_unique_searchers),
- all_unique_searcher._iterations)
- unique_searchers = next_unique_searchers
- return unique_nodes.difference(common_searcher.seen)
+ next_unique_searchers = self._collapse_unique_searchers(
+ unique_tip_searchers, common_to_all_unique_nodes)
+ if len(unique_tip_searchers) != len(next_unique_searchers):
+ if 'graph' in debug.debug_flags:
+ trace.mutter('Collapsed %d unique searchers => %d'
+ ' at %s iterations',
+ len(unique_tip_searchers),
+ len(next_unique_searchers),
+ all_unique_searcher._iterations)
+ unique_tip_searchers = next_unique_searchers
@symbol_versioning.deprecated_method(symbol_versioning.one_one)
def get_parents(self, revisions):
@@ -1047,7 +1187,16 @@
return self
def find_seen_ancestors(self, revisions):
- """Find ancestors of these revisions that have already been seen."""
+ """Find ancestors of these revisions that have already been seen.
+
+ This function generally makes the assumption that querying for the
+ parents of a node that has already been queried is reasonably cheap.
+ (eg, not a round trip to a remote host).
+ """
+ # TODO: Often we might ask one searcher for its seen ancestors, and
+ # then ask another searcher the same question. This can result in
+ # searching the same revisions repeatedly if the two searchers
+ # have a lot of overlap.
all_seen = self.seen
pending = set(revisions).intersection(all_seen)
seen_ancestors = set(pending)
@@ -1082,6 +1231,9 @@
None of the specified revisions are required to be present in the
search list. In this case, the call is a no-op.
"""
+ # TODO: does this help performance?
+ # if not revisions:
+ # return set()
revisions = frozenset(revisions)
if self._returning == 'next':
stopped = self._next_query.intersection(revisions)
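
One detail of _refine_unique_nodes worth noting before the test diff: the
all_unique_searcher is only stepped on every fifth pass of the main loop
(STEP_UNIQUE_SEARCHER_EVERY above), since it rarely yields new information.
The counter pattern, reduced to a minimal sketch (assuming searcher objects
with a step() method, as in the diff):

    STEP_UNIQUE_SEARCHER_EVERY = 5

    def refine_loop(common_searcher, unique_tip_searchers,
                    all_unique_searcher):
        counter = 0
        while common_searcher._next_query:
            common_searcher.step()
            for searcher in unique_tip_searchers:
                searcher.step()              # cheap, stepped every pass
            if counter == 0:                 # passes 0, 5, 10, ...
                all_unique_searcher.step()   # expensive, 1 pass in 5
            counter = (counter + 1) % STEP_UNIQUE_SEARCHER_EVERY
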
=== modified file 'bzrlib/tests/test_graph.py'
--- a/bzrlib/tests/test_graph.py 2008-05-05 20:29:06 +0000
+++ b/bzrlib/tests/test_graph.py 2008-05-06 20:54:31 +0000
@@ -283,11 +283,11 @@
# |/
# j
#
-# y is found to be common right away, but is the start of a long series of
+# x is found to be common right away, but is the start of a long series of
# common commits.
# o is actually common, but the i-j shortcut makes it look like it is actually
-# unique to j at first, you have to traverse all of y->o to find it.
-# q,n give the walker from j a common point to stop searching, as does p,f.
+# unique to j at first, you have to traverse all of x->o to find it.
+# q,m gives the walker from j a common point to stop searching, as does p,f.
# k-n exists so that the second pass still has nodes that are worth searching,
# rather than instantly cancelling the extra walker.
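
The 'collapse searchers for performance' half of the commit message is the
_collapse_unique_searchers helper in the graph.py diff: searchers that are
about to search an identical set of tips are merged into one, keeping only
the intersection of their seen ancestry. Reduced to a sketch (assuming
searchers that expose _next_query and seen, as above):

    def collapse(searchers):
        buckets = {}
        for searcher in searchers:
            tips = frozenset(searcher._next_query)
            if tips:  # searchers with nothing left to search are dropped
                buckets.setdefault(tips, []).append(searcher)
        survivors = []
        for group in buckets.values():
            keeper = group[0]
            for other in group[1:]:
                # Only ancestry seen by every merged searcher stays common.
                keeper.seen.intersection_update(other.seen)
            survivors.append(keeper)
        return survivors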