Rev 2206: Merge GraphDelta support. in file:///home/robertc/source/baz/hpss-remotegraph/
Robert Collins
robertc at robertcollins.net
Mon Apr 9 23:19:47 BST 2007
At file:///home/robertc/source/baz/hpss-remotegraph/
------------------------------------------------------------
revno: 2206
revision-id: robertc at robertcollins.net-20070409221942-d0t2lemx16nyn2ro
parent: robertc at robertcollins.net-20070405093526-bfbpg82tlc7kldni
parent: robertc at robertcollins.net-20070403074302-fo3x80f961hh7e7f
committer: Robert Collins <robertc at robertcollins.net>
branch nick: hpss-remotegraph
timestamp: Tue 2007-04-10 08:19:42 +1000
message:
Merge GraphDelta support.
added:
bzrlib/tests/repository_implementations/test_revision_graph.py test_revision_graph.-20070331011036-7x4dtgs5sopm7kcq-1
doc/design/ design-20070401020531-rqa7d77i22idnj78-1
doc/design/server.txt server.txt-20070401020531-rqa7d77i22idnj78-2
modified:
bzrlib/errors.py errors.py-20050309040759-20512168c4e14fbd
bzrlib/graph.py graph.py-20050905070950-b47dce53236c5e48
bzrlib/revision.py revision.py-20050309040759-e77802c08f3999d5
bzrlib/tests/repository_implementations/__init__.py __init__.py-20060131092037-9564957a7d4a841b
bzrlib/tests/repository_implementations/test_repository.py test_repository.py-20060131092128-ad07f494f5c9d26c
bzrlib/tests/test_errors.py test_errors.py-20060210110251-41aba2deddf936a8
bzrlib/tests/test_graph.py testgraph.py-20050905070950-42e6c958106610fd
------------------------------------------------------------
revno: 2018.1.2.1.50.2.80.1.99.1.9.1.21.1.26.2.74.2.2.2.6.2.12
merged: robertc at robertcollins.net-20070403074302-fo3x80f961hh7e7f
parent: robertc at robertcollins.net-20070402072850-a7b90xg48rdujr63
committer: Robert Collins <robertc at robertcollins.net>
branch nick: netsim
timestamp: Tue 2007-04-03 17:43:02 +1000
message:
Make Graph.get_descendants be lazy, and return the internal reference to avoid copy-overhead. Also increase GraphDelta performance.
------------------------------------------------------------
revno: 2018.1.2.1.50.2.80.1.99.1.9.1.21.1.26.2.74.2.2.2.6.2.11
merged: robertc at robertcollins.net-20070402072850-a7b90xg48rdujr63
parent: robertc at robertcollins.net-20070401113531-onqhpzb37lexq17a
committer: Robert Collins <robertc at robertcollins.net>
branch nick: netsim
timestamp: Mon 2007-04-02 17:28:50 +1000
message:
Teach GraphDelta to handle ghosts.
------------------------------------------------------------
revno: 2018.1.2.1.50.2.80.1.99.1.9.1.21.1.26.2.74.2.2.2.6.2.10
merged: robertc at robertcollins.net-20070401113531-onqhpzb37lexq17a
parent: robertc at robertcollins.net-20070401100816-j0298tbd10xjsguv
committer: Robert Collins <robertc at robertcollins.net>
branch nick: netsim
timestamp: Sun 2007-04-01 21:35:31 +1000
message:
Implement a NthNodeSelector.
------------------------------------------------------------
revno: 2018.1.2.1.50.2.80.1.99.1.9.1.21.1.26.2.74.2.2.2.6.2.9
merged: robertc at robertcollins.net-20070401100816-j0298tbd10xjsguv
parent: robertc at robertcollins.net-20070401073652-xhjw2gmfgjsv4tvi
committer: Robert Collins <robertc at robertcollins.net>
branch nick: netsim
timestamp: Sun 2007-04-01 20:08:16 +1000
message:
Finish unit tests for creating and applying GraphDelta's, fixing a bug with prefix-deltas, and update the walkthroughts in the server design documentation.
------------------------------------------------------------
revno: 2018.1.2.1.50.2.80.1.99.1.9.1.21.1.26.2.74.2.2.2.6.2.8
merged: robertc at robertcollins.net-20070401073652-xhjw2gmfgjsv4tvi
parent: robertc at robertcollins.net-20070401073344-5seq59fommjsi374
committer: Robert Collins <robertc at robertcollins.net>
branch nick: netsim
timestamp: Sun 2007-04-01 17:36:52 +1000
message:
Test GraphDeltas to the end of a graph with multiple tails when the subgraph is partitioned.
------------------------------------------------------------
revno: 2018.1.2.1.50.2.80.1.99.1.9.1.21.1.26.2.74.2.2.2.6.2.7
merged: robertc at robertcollins.net-20070401073344-5seq59fommjsi374
parent: robertc at robertcollins.net-20070401073118-vixkw83zk2plkkwz
committer: Robert Collins <robertc at robertcollins.net>
branch nick: netsim
timestamp: Sun 2007-04-01 17:33:44 +1000
message:
Test GraphDeltas to the end of a graph with multiple tails.
------------------------------------------------------------
revno: 2018.1.2.1.50.2.80.1.99.1.9.1.21.1.26.2.74.2.2.2.6.2.6
merged: robertc at robertcollins.net-20070401073118-vixkw83zk2plkkwz
parent: robertc at robertcollins.net-20070401071322-2wi0ecpdtf0jrjm0
committer: Robert Collins <robertc at robertcollins.net>
branch nick: netsim
timestamp: Sun 2007-04-01 17:31:18 +1000
message:
Test GraphDeltas on multiple-step internal-subgraphs.
------------------------------------------------------------
revno: 2018.1.2.1.50.2.80.1.99.1.9.1.21.1.26.2.74.2.2.2.6.2.5
merged: robertc at robertcollins.net-20070401071322-2wi0ecpdtf0jrjm0
parent: robertc at robertcollins.net-20070401063117-71od8scu0te38npy
committer: Robert Collins <robertc at robertcollins.net>
branch nick: netsim
timestamp: Sun 2007-04-01 17:13:22 +1000
message:
Basic graph delta support functional, using a new graph get_heads method.
------------------------------------------------------------
revno: 2018.1.2.1.50.2.80.1.99.1.9.1.21.1.26.2.74.2.2.2.6.2.4
merged: robertc at robertcollins.net-20070401063117-71od8scu0te38npy
parent: robertc at robertcollins.net-20070401061938-xzhb1m77i7fzzkb6
committer: Robert Collins <robertc at robertcollins.net>
branch nick: netsim
timestamp: Sun 2007-04-01 16:31:17 +1000
message:
Tighten the algorithm requirements - tails are not required because the exclusion on misses will converge. Also add Nonempty to empty test.
------------------------------------------------------------
revno: 2018.1.2.1.50.2.80.1.99.1.9.1.21.1.26.2.74.2.2.2.6.2.3
merged: robertc at robertcollins.net-20070401061938-xzhb1m77i7fzzkb6
parent: robertc at robertcollins.net-20070401021622-c63kkszulpt30wae
committer: Robert Collins <robertc at robertcollins.net>
branch nick: netsim
timestamp: Sun 2007-04-01 16:19:38 +1000
message:
Start work on GraphDelta support.
------------------------------------------------------------
revno: 2018.1.2.1.50.2.80.1.99.1.9.1.21.1.26.2.74.2.2.2.6.2.2
merged: robertc at robertcollins.net-20070401021622-c63kkszulpt30wae
parent: robertc at robertcollins.net-20070331011121-inbkb9pi7lmgbfhs
committer: Robert Collins <robertc at robertcollins.net>
branch nick: netsim
timestamp: Sun 2007-04-01 12:16:22 +1000
message:
Sketch out finding missing revisions in client server mode.
------------------------------------------------------------
revno: 2018.1.2.1.50.2.80.1.99.1.9.1.21.1.26.2.74.2.2.2.6.2.1
merged: robertc at robertcollins.net-20070331011121-inbkb9pi7lmgbfhs
parent: pqm at pqm.ubuntu.com-20070329191009-2b533e883212576c
committer: Robert Collins <robertc at robertcollins.net>
branch nick: netsim
timestamp: Sat 2007-03-31 11:11:21 +1000
message:
Split get_revision_graph tests out of repository_implementations.test_repository.
=== added file 'bzrlib/tests/repository_implementations/test_revision_graph.py'
--- a/bzrlib/tests/repository_implementations/test_revision_graph.py 1970-01-01 00:00:00 +0000
+++ b/bzrlib/tests/repository_implementations/test_revision_graph.py 2007-03-31 01:11:21 +0000
@@ -0,0 +1,80 @@
+# Copyright (C) 2007 Canonical Ltd
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+
+"""Tests for revision graph access."""
+
+from bzrlib.errors import (NoSuchRevision,
+ )
+from bzrlib.revision import NULL_REVISION
+from bzrlib.tests.repository_implementations.test_repository import (
+ TestCaseWithComplexRepository,
+ )
+
+
+class TestRevisionGraph(TestCaseWithComplexRepository):
+
+ def test_get_revision_graph(self):
+ # we can get a mapping of id->parents for the entire revision graph or bits thereof.
+ self.assertEqual({'rev1':[],
+ 'rev2':['rev1'],
+ 'rev3':['rev2'],
+ 'rev4':['rev3'],
+ },
+ self.bzrdir.open_repository().get_revision_graph(None))
+ self.assertEqual({'rev1':[]},
+ self.bzrdir.open_repository().get_revision_graph('rev1'))
+ self.assertEqual({'rev1':[],
+ 'rev2':['rev1']},
+ self.bzrdir.open_repository().get_revision_graph('rev2'))
+ self.assertRaises(NoSuchRevision,
+ self.bzrdir.open_repository().get_revision_graph,
+ 'orphan')
+ # and ghosts are not mentioned
+ self.assertEqual({'rev1':[],
+ 'rev2':['rev1'],
+ 'rev3':['rev2'],
+ },
+ self.bzrdir.open_repository().get_revision_graph('rev3'))
+ # and we can ask for the NULLREVISION graph
+ self.assertEqual({},
+ self.bzrdir.open_repository().get_revision_graph(NULL_REVISION))
+
+ def test_get_revision_graph_with_ghosts(self):
+ # we can get a graph object with roots, ghosts, ancestors and
+ # descendants.
+ repo = self.bzrdir.open_repository()
+ graph = repo.get_revision_graph_with_ghosts([])
+ self.assertEqual(set(['rev1']), graph.roots)
+ self.assertEqual(set(['ghost1', 'ghost2']), graph.ghosts)
+ self.assertEqual({'rev1':[],
+ 'rev2':['rev1'],
+ 'rev3':['rev2', 'ghost1'],
+ 'rev4':['rev3', 'ghost1', 'ghost2'],
+ },
+ graph.get_ancestors())
+ self.assertEqual({'ghost1':{'rev3':1, 'rev4':1},
+ 'ghost2':{'rev4':1},
+ 'rev1':{'rev2':1},
+ 'rev2':{'rev3':1},
+ 'rev3':{'rev4':1},
+ 'rev4':{},
+ },
+ graph.get_descendants())
+ # and we can ask for the NULLREVISION graph
+ graph = repo.get_revision_graph_with_ghosts([NULL_REVISION])
+ self.assertEqual({}, graph.get_ancestors())
+ self.assertEqual({}, graph.get_descendants())
+
=== added directory 'doc/design'
=== added file 'doc/design/server.txt'
--- a/doc/design/server.txt 1970-01-01 00:00:00 +0000
+++ b/doc/design/server.txt 2007-04-01 10:08:16 +0000
@@ -0,0 +1,126 @@
+====================
+Bazaar Server Design
+====================
+
+VFS Layer
+=========
+
+High-performance server
+=======================
+
+Determining missing data between client and server.
+
+Basic idea:
+Perform a recursive graph (DAG) difference algorithm incrementally.
+
+Some tools are needed. The first is 'graph_delta_description'. This generates a
+minimal description of a graph relative to a supergraph. Sketching this out: we
+could use a number of different tools to describe this: nodes to start at; a
+count of paths to avoid (if paths are numbered repeatable), nodes to search
+from, nodes to finish at; nodes that should not be included; edges to skip, and
+more. If the description will be larger than the size of the node names in the
+graph, an error should be raised.
+The functional requirements of the delta description are:
+ * It can be applied by the generator to the supergraph to recreate the graph,
+ even if the supergraph has new nodes added; once added a node and the paths
+ from it are immutable.
+ * The description can be serialised.
+The initial implementation will simply list the nodes to start from, and nodes
+that should not be traversed to. This can represent any arbitrary subgraph of
+the supergraph and does not require complex cutting logic.
+
+The second tool is 'node_sample_selection'. This takes a graph and returns a
+selection of nodes from within it along with a short token. This selection must
+have the following properties:
+ * It is repeatable given the token and the graph.
+ * The token must be short(< the size of a node) and serialisable.
+
+Constraints:
+ * The client is permitted state.
+ * The server is not permitted to hold state about a particular client across requests.
+
+Client side:
+To generate a list of the server nodes which the client does not have, the
+client asks the server for a graph_delta_description and a
+node_sample_selection. In asking for this the client provides the most recent
+graph_delta_description, node_sample_selection token, a bitmap of presence
+flags in the node_sample_selection the server supplied. The first request
+clearly has nothing to supply. The client then loops around on this until the
+server supplies a 'completed' response with a graph_delta_description and no
+node_sample_selection: at this point the server has decided that the
+computation is complete.
+
+Server side:
+When a client asks for a graph_delta_description and node_sample_selection with
+no previous description or token & bitmap, the server should create a
+current_graph which equals the supergraph, and return that and a
+node_sample_selection.
+When the client asks with a previous graph_delta_description and
+node_sample_selection & bitmap, then the server infers as much as it can about
+the client data base from the presence bitmap. In doing this it colours its
+supergraph with several labels (each node can have only one label). Present -
+the client has the node. Missing - the client does not have the node. Unknown -
+the client may have the node. If the set of Unknown nodes is empty then the
+server sends a completed message along with a graph_delta_description for the
+Missing nodes. If the set of Unknown nodes is non-empty then the server sends a
+graph_delta_description for the Unknown subgraph(s) along with a
+node_sample_selection from that. The inference proceeds by:
+ 1. Mark the entire graph unknown.
+ 2. For every node reachable from the subgraph specified by the graph delta,
+ mark it Present (it was Present in the last iteration).
+ 2. For every node derived from the subgraph specified by the graph delta, mark
+ it Missing. (it was missing in the last iteration).
+ 3. For every present node the client supplied walk from it to its parents and
+ mark the transitive closure as Present.
+ 4. For every node the client did not mark as present, find its descendants and
+ mark them as Missing. An assertion may be made at this point that they were
+ not already marked 'Present' (this would indicate ghosts on the client).
+ Alternatively we may ignore a missing statement in the ancestry of a
+ present statement, or try to infer the size of the ghosted region.
+
+
+Worked example:
+
+Server Graph: {A:[]}
+Client Graph: {}
+
+Client message 1:
+get_missing()
+response:
+unknowns=(HEADS=[], STOPS=[])
+sample=(0, [A])
+Client message 2:
+get_missing((HEADS=[], STOPS=[]), (0, [False]))
+response:
+'complete', missing=(HEADS=[], STOPS=[])
+
+
+Server Graph: {A:[], B:[A], C:[A], D:[B, C]}
+Client Graph: {A:[], B:[A], E:[B]}
+
+Client message 1:
+get_missing
+response:
+unknown=(HEADS=[], STOPS=[])
+sample=(0, [A, D])
+client message 2:
+get_missing((HEADS=[], STOPS=[]), (0, [True, False]))
+response:
+unknown=(HEADS=[B, C], STOPS=[A])
+sample=(0, [B, C])
+client message 3:
+get_missing((HEADS=[B,C], STOPS=[A]), (0, [True, False]))
+response:
+'complete', missing=(HEADS=[], STOPS=[B, A])
+
+
+
+Discussion:
+ - could the client infer the cut points from the returned bitmap ? Might save a
+ little bw but will be problematic on the second roundtrip when the sent
+ nodes does not match the entire new cutlist.
+ - The algorithm can work symmetrically, so the client can provide its own
+ node_sample_selection in each roundtrip, halving the roundtrips required to
+ determine the missing revisions.
+ - Extremely well balanced graphs may lead to very wide tails lists for the
+ node_sample_selection and graph_delta_description routines.
=== modified file 'bzrlib/errors.py'
--- a/bzrlib/errors.py 2007-04-05 09:35:26 +0000
+++ b/bzrlib/errors.py 2007-04-09 22:19:42 +0000
@@ -266,6 +266,15 @@
self.url = url
+class NotSubGraph(BzrError):
+
+ _fmt = "'%(to)s' is not a subgraph of '%(from_)s'."
+
+ def __init__(self, from_, to):
+ self.from_ = from_
+ self.to = to
+
+
class WorkingTreeAlreadyPopulated(BzrError):
_fmt = """Working tree already populated in %(base)s"""
=== modified file 'bzrlib/graph.py'
--- a/bzrlib/graph.py 2006-10-05 05:37:25 +0000
+++ b/bzrlib/graph.py 2007-04-03 07:43:02 +0000
@@ -15,6 +15,7 @@
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+from bzrlib import errors
from bzrlib.tsort import topo_sort
@@ -116,27 +117,42 @@
self.roots = set([])
self.ghosts = set([])
self._graph_ancestors = {}
- self._graph_descendants = {}
+ self._graph_descendants = None
+
+ def __eq__(self, other):
+ if not isinstance(other, Graph):
+ return False
+ if self._graph_descendants and not other._graph_descendants:
+ self._generate_descendants()
+ return self.__dict__ == other.__dict__
def add_ghost(self, node_id):
"""Add a ghost to the graph."""
self.ghosts.add(node_id)
- self._ensure_descendant(node_id)
+ if self._graph_descendants is not None:
+ self._graph_descendants.setdefault(node_id, {})
def add_node(self, node_id, parent_ids):
"""Add node_id to the graph with parent_ids as its parents."""
if parent_ids == []:
self.roots.add(node_id)
self._graph_ancestors[node_id] = list(parent_ids)
- self._ensure_descendant(node_id)
- for parent in parent_ids:
- self._ensure_descendant(parent)
- self._graph_descendants[parent][node_id] = 1
-
- def _ensure_descendant(self, node_id):
- """Ensure that a descendant lookup for node_id will work."""
- if not node_id in self._graph_descendants:
- self._graph_descendants[node_id] = {}
+ if self._graph_descendants is not None:
+ self._graph_descendants.setdefault(node_id, {})
+ for parent in parent_ids:
+ self._graph_descendants.setdefault(parent, {})[node_id] = 1
+
+ def changes_from(self, graph_from):
+ """Create a GraphDelta from graph_from to self."""
+ return GraphDelta._changes_from(graph_from, self)
+
+ def _generate_descendants(self):
+ """Construct the _graph_descendants node if needed."""
+ self._graph_descendants = {}
+ for node_id in self._graph_ancestors:
+ self._graph_descendants.setdefault(node_id, {})
+ for parent in self._graph_ancestors[node_id]:
+ self._graph_descendants.setdefault(parent, {})[node_id] = 1
def get_ancestors(self):
"""Return a dictionary of graph node:ancestor_list entries."""
@@ -159,4 +175,155 @@
def get_descendants(self):
"""Return a dictionary of graph node:child_node:distance entries."""
- return dict(self._graph_descendants.items())
+ if self._graph_descendants is None:
+ self._generate_descendants()
+ return self._graph_descendants
+
+ def get_heads(self):
+ """Determine the heads of the graph."""
+ return set([node for node, descendants in
+ self.get_descendants().items() if not descendants])
+
+ def __repr__(self):
+ return "Graph ancestors=%s, descendants=%r, ghosts=%r" % (
+ self._graph_ancestors, self.get_descendants(), self.ghosts)
+
+
+class GraphDelta(object):
+ """A delta between two DAG's.
+
+ Currently this requires that the from DAG be a supergraph of the to DAG.
+ This restriction may be lifted in the future.
+
+ Public attributes:
+ heads: A list of heads that the subgraph should start from. An empty list
+ means start from the heads of the from graph when applying the delta.
+ cut_from: A list of nodes to cut out of the output. An empty list means
+ cut nothing.
+ """
+
+ def __init__(self):
+ self.heads = set()
+ self.cut_from = set()
+
+ def __eq__(self, other):
+ return isinstance(other, GraphDelta) and self.__dict__ == other.__dict__
+
+ def apply_to(self, from_):
+ """Apply this delta to the graph from_.
+
+ :return: A new graph derived from from_ and this delta.
+ """
+ result = Graph()
+ if self.heads:
+ pending = set(self.heads)
+ else:
+ pending = set(from_.get_heads())
+ cut_from = set(self.cut_from) # in case its a list
+ ancestors = from_.get_ancestors()
+ done = set()
+ while pending:
+ node = pending.pop()
+ done.add(node)
+ if node not in cut_from:
+ try:
+ result.add_node(node, ancestors[node])
+ pending.update(set(ancestors[node]).difference(done))
+ except KeyError:
+ result.add_ghost(node)
+ return result
+
+ @classmethod
+ def _changes_from(cls, from_, to):
+ """Construct a delta from from_ to to."""
+ result = GraphDelta()
+ ancestors = from_.get_ancestors()
+ from_nodes = set(ancestors)
+ to_nodes = set(to.get_ancestors())
+ if to_nodes.difference(from_nodes):
+ raise errors.NotSubGraph(from_, to)
+ if to_nodes == from_nodes:
+ return result
+ to_heads = to.get_heads()
+ result.heads = to_heads
+ # if the heads list is equal to the from_ heads list,
+ # omit it. XXX: this special casing may be a pointless
+ # optimisation.
+ if result.heads == from_.get_heads():
+ result.heads = set([])
+
+ # start walking from all the nodes in to.
+ # when we hit a node not in to, add it to cut_from.
+ # when we hit a node in to, remove it from heads.
+ # discount ghosts in parent lists.
+ # special case, symmetrical with the apply_to use of the
+ # source graph heads when the to graph is empty:
+ # we use the source graph heads as the cut list giving
+ # an instant empty graph.
+ if not to_nodes:
+ result.cut_from = from_.get_heads()
+ return result
+ pending = set(to_heads)
+ done = set()
+ while pending:
+ node = pending.pop()
+ done.add(node)
+ node_parents = ancestors[node]
+ for parent in node_parents:
+ if parent in from_.ghosts:
+ continue
+ if parent not in to_nodes:
+ result.cut_from.add(parent)
+ continue
+ if parent in result.heads:
+ result.heads.remove(parent)
+ if parent not in done:
+ pending.add(parent)
+ # remove heads that
+ return result
+
+ def __repr__(self):
+ return "GraphDelta heads=%r, cut_from=%r" % (self.heads, self.cut_from)
+
+
+class NthNodeSelector(object):
+ """A NodeSelector that selects event nth node chosen to meet a limit.
+
+ This is almost certainly not a most-optimal node selector for smart server
+ use, but it is extremely simple to build and sufficient to test the concept.
+ """
+
+ def __init__(self, limit=1):
+ """Construct a NthNodeSelector.
+
+ :param limit: The total number of nodes that may be returned from
+ select on any graph. This count may exceed the total size of the
+ graph.
+ """
+ if limit < 1:
+ raise ValueError("limit (%r) must be >= 1" % (limit))
+ self.limit = limit
+
+ def select(self, graph):
+ """Select nodes from graph.
+
+ The nodes chosen are those found by selecting every len(graph)/limit
+ node.
+ """
+ nodes = graph.get_ancestors().keys()
+ nodes.sort()
+ count = len(nodes)
+ interval = count/self.limit
+ if interval < 1:
+ # this performs badly when 2*limit >= count, as it selects only
+ # the first limit nodes: this can be addressed by using a
+ # floating point interval, but its not done yet. TODO XXX
+ interval = 1
+ pos = int(interval * 0.5)
+ result = []
+ while pos < count:
+ if len(result) == self.limit:
+ return result
+ result.append(nodes[pos])
+ pos += interval
+ return result
=== modified file 'bzrlib/revision.py'
--- a/bzrlib/revision.py 2007-01-17 15:37:08 +0000
+++ b/bzrlib/revision.py 2007-04-03 07:43:02 +0000
@@ -253,7 +253,7 @@
[revision_a, revision_b])
# convert to a NULL_REVISION based graph.
ancestors = graph.get_ancestors()
- descendants = graph.get_descendants()
+ descendants = dict(graph.get_descendants().items())
common = set(graph.get_ancestry(revision_a)).intersection(
set(graph.get_ancestry(revision_b)))
descendants[NULL_REVISION] = {}
=== modified file 'bzrlib/tests/repository_implementations/__init__.py'
--- a/bzrlib/tests/repository_implementations/__init__.py 2007-03-13 06:55:43 +0000
+++ b/bzrlib/tests/repository_implementations/__init__.py 2007-04-09 22:19:42 +0000
@@ -52,6 +52,7 @@
'bzrlib.tests.repository_implementations.test_reconcile',
'bzrlib.tests.repository_implementations.test_repository',
'bzrlib.tests.repository_implementations.test_revision',
+ 'bzrlib.tests.repository_implementations.test_revision_graph',
'bzrlib.tests.repository_implementations.test_statistics',
]
=== modified file 'bzrlib/tests/repository_implementations/test_repository.py'
--- a/bzrlib/tests/repository_implementations/test_repository.py 2007-04-05 09:35:26 +0000
+++ b/bzrlib/tests/repository_implementations/test_repository.py 2007-04-09 22:19:42 +0000
@@ -506,6 +506,8 @@
def setUp(self):
super(TestCaseWithComplexRepository, self).setUp()
tree_a = self.make_branch_and_tree('a')
+ # save the tree for later mutations
+ self.tree_a = tree_a
self.bzrdir = tree_a.branch.bzrdir
# add a corrupt inventory 'orphan'
# this may need some generalising for knits.
@@ -524,6 +526,11 @@
tree_a.add_parent_tree_id('ghost1')
tree_a.add_parent_tree_id('ghost2')
tree_a.commit('rev4', rev_id='rev4', allow_pointless=True)
+ # the resulting graph is:
+ # rev1: [], rev2:[rev1], rev3:[rev2, ghost1], rev4[rev3, ghost1, ghost2]
+
+
+class TestRevisionTrees(TestCaseWithComplexRepository):
def test_revision_trees(self):
revision_ids = ['rev1', 'rev2', 'rev3', 'rev4']
@@ -543,6 +550,9 @@
revisions]
assert deltas1 == deltas2
+
+class TestAllRevisionIds(TestCaseWithComplexRepository):
+
def test_all_revision_ids(self):
# all_revision_ids -> all revisions
self.assertEqual(['rev1', 'rev2', 'rev3', 'rev4'],
@@ -554,58 +564,6 @@
self.assertRaises(errors.NoSuchRevision,
self.bzrdir.open_repository().get_ancestry, 'orphan')
- def test_get_revision_graph(self):
- # we can get a mapping of id->parents for the entire revision graph or bits thereof.
- self.assertEqual({'rev1':[],
- 'rev2':['rev1'],
- 'rev3':['rev2'],
- 'rev4':['rev3'],
- },
- self.bzrdir.open_repository().get_revision_graph(None))
- self.assertEqual({'rev1':[]},
- self.bzrdir.open_repository().get_revision_graph('rev1'))
- self.assertEqual({'rev1':[],
- 'rev2':['rev1']},
- self.bzrdir.open_repository().get_revision_graph('rev2'))
- self.assertRaises(NoSuchRevision,
- self.bzrdir.open_repository().get_revision_graph,
- 'orphan')
- # and ghosts are not mentioned
- self.assertEqual({'rev1':[],
- 'rev2':['rev1'],
- 'rev3':['rev2'],
- },
- self.bzrdir.open_repository().get_revision_graph('rev3'))
- # and we can ask for the NULLREVISION graph
- self.assertEqual({},
- self.bzrdir.open_repository().get_revision_graph(NULL_REVISION))
-
- def test_get_revision_graph_with_ghosts(self):
- # we can get a graph object with roots, ghosts, ancestors and
- # descendants.
- repo = self.bzrdir.open_repository()
- graph = repo.get_revision_graph_with_ghosts([])
- self.assertEqual(set(['rev1']), graph.roots)
- self.assertEqual(set(['ghost1', 'ghost2']), graph.ghosts)
- self.assertEqual({'rev1':[],
- 'rev2':['rev1'],
- 'rev3':['rev2', 'ghost1'],
- 'rev4':['rev3', 'ghost1', 'ghost2'],
- },
- graph.get_ancestors())
- self.assertEqual({'ghost1':{'rev3':1, 'rev4':1},
- 'ghost2':{'rev4':1},
- 'rev1':{'rev2':1},
- 'rev2':{'rev3':1},
- 'rev3':{'rev4':1},
- 'rev4':{},
- },
- graph.get_descendants())
- # and we can ask for the NULLREVISION graph
- graph = repo.get_revision_graph_with_ghosts([NULL_REVISION])
- self.assertEqual({}, graph.get_ancestors())
- self.assertEqual({}, graph.get_descendants())
-
def test_reserved_id(self):
repo = self.make_repository('repository')
self.assertRaises(errors.ReservedId, repo.add_inventory, 'reserved:',
=== modified file 'bzrlib/tests/test_errors.py'
--- a/bzrlib/tests/test_errors.py 2007-03-30 01:46:31 +0000
+++ b/bzrlib/tests/test_errors.py 2007-04-09 22:19:42 +0000
@@ -104,6 +104,10 @@
str(error))
self.assertIsInstance(error, errors.NoSuchRevision)
+ def test_not_sub_graph(self):
+ error = errors.NotSubGraph("from", "to")
+ self.assertEqualDiff("'to' is not a subgraph of 'from'.", str(error))
+
def test_not_write_locked(self):
error = errors.NotWriteLocked('a thing to repr')
self.assertEqualDiff("'a thing to repr' is not write locked but needs "
=== modified file 'bzrlib/tests/test_graph.py'
--- a/bzrlib/tests/test_graph.py 2006-10-11 23:08:27 +0000
+++ b/bzrlib/tests/test_graph.py 2007-04-02 07:28:50 +0000
@@ -14,8 +14,14 @@
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+from bzrlib.errors import NotSubGraph
from bzrlib.tests import TestCase
-from bzrlib.graph import node_distances, nodes_by_distance, Graph
+from bzrlib.graph import (node_distances,
+ nodes_by_distance,
+ Graph,
+ GraphDelta,
+ NthNodeSelector,
+ )
class TestBase(TestCase):
@@ -78,3 +84,384 @@
# check the result contains ghosts:
self.assertEqual({'ghost': {'rev1': 1}, 'rev1': {}},
graph.get_descendants())
+
+ def test_get_heads(self):
+ # graph objects allow querying for their heads.
+ # test empty graph, a partitioned graph, and a graph
+ # with a unmerged branch
+ # empty
+ graph = Graph()
+ self.assertEqual(set(), graph.get_heads())
+ # simple
+ graph.add_node('A', [])
+ self.assertEqual(set(['A']), graph.get_heads())
+ graph.add_node('B', ['A'])
+ self.assertEqual(set(['B']), graph.get_heads())
+ # simple, partitioned
+ graph.add_node('C', [])
+ self.assertEqual(set(['B', 'C']), graph.get_heads())
+ graph.add_node('D', ['C'])
+ self.assertEqual(set(['B', 'D']), graph.get_heads())
+ # joined up parititons
+ graph.add_node('E', ['D', 'B'])
+ self.assertEqual(set(['E']), graph.get_heads())
+ # unmerged branch
+ graph.add_node('F', ['B'])
+ self.assertEqual(set(['E', 'F']), graph.get_heads())
+ # and merge it
+ graph.add_node('G', ['E', 'F'])
+ self.assertEqual(set(['G']), graph.get_heads())
+
+
+class TestGraphDelta(TestCase):
+
+ def check_delta(self, expected_delta, graph_from, graph_to):
+ """Check the delta from graph_from to graph_to.
+
+ This will calculate a new delta, check that applying it to graph_from
+ results in graph_to, and that it is equal to expected_delta.
+ """
+ delta = graph_to.changes_from(graph_from)
+ self.assertIsInstance(delta, GraphDelta)
+ self.assertEqual(expected_delta, delta)
+ new_to = delta.apply_to(graph_from)
+ self.assertEqual(graph_to, new_to)
+
+ def testEmptyEmptyGraph(self):
+ graph_from = Graph()
+ graph_to = Graph()
+ self.check_delta(GraphDelta(), graph_from, graph_to)
+
+ def testTargetNotSubGraph(self):
+ """When the target is not a sub-graph of the source, get an exception."""
+ graph_from = Graph()
+ graph_to = Graph()
+ graph_to.add_node('A', [])
+ self.assertRaises(NotSubGraph, graph_to.changes_from, graph_from)
+
+ def testSmallEmptyGraph(self):
+ graph_from = Graph()
+ graph_from.add_node('A', [])
+ graph_to = Graph()
+ result = GraphDelta()
+ # keeping the implied heads lets the graph-to-build be maintained
+ # outside the core routine; this is cleaner I think for now.
+ result.cut_from = set(['A'])
+ self.check_delta(result, graph_from, graph_to)
+
+ def testSmallIdentityGraph(self):
+ graph_from = Graph()
+ graph_from.add_node('A', [])
+ graph_to = Graph()
+ graph_to.add_node('A', [])
+ result = GraphDelta()
+ self.check_delta(result, graph_from, graph_to)
+
+ def testMiddleSegment(self):
+ # grabbing a delta from A->B->C->D->E to B->C->D:
+ graph_from = Graph()
+ graph_from.add_node('A', ['B'])
+ graph_from.add_node('B', ['C'])
+ graph_from.add_node('C', ['D'])
+ graph_from.add_node('D', ['E'])
+ graph_from.add_node('E', [])
+ graph_to = Graph()
+ graph_to.add_node('B', ['C'])
+ graph_to.add_node('C', ['D'])
+ graph_to.add_node('D', ['E'])
+ result = GraphDelta()
+ result.heads = set(['B'])
+ result.cut_from = set(['E'])
+ self.check_delta(result, graph_from, graph_to)
+
+ def testMiddleSegmentWithMerges(self):
+ # grabbing a delta where the to graph is totally interior and contains embedded merges.
+ graph_from = Graph()
+ graph_from.add_node('A', ['B'])
+ graph_from.add_node('B', ['C', 'D'])
+ graph_from.add_node('C', ['E'])
+ graph_from.add_node('D', ['E'])
+ graph_from.add_node('E', ['F'])
+ graph_from.add_node('F', [])
+ graph_to = Graph()
+ graph_to.add_node('B', ['C', 'D'])
+ graph_to.add_node('C', ['E'])
+ graph_to.add_node('D', ['E'])
+ graph_to.add_node('E', ['F'])
+ result = GraphDelta()
+ result.heads = set(['B'])
+ result.cut_from = set(['F'])
+ self.check_delta(result, graph_from, graph_to)
+
+ def testMiddleSegmentWithMergeStart(self):
+ # grabbing a delta where the to graph is totally interior and contains the start
+ # of a branch in the graph
+ graph_from = Graph()
+ graph_from.add_node('A', ['B'])
+ graph_from.add_node('B', ['C', 'D'])
+ graph_from.add_node('C', ['E'])
+ graph_from.add_node('D', ['E'])
+ graph_from.add_node('E', ['F'])
+ graph_from.add_node('F', [])
+ graph_to = Graph()
+ graph_to.add_node('C', ['E'])
+ graph_to.add_node('D', ['E'])
+ graph_to.add_node('E', ['F'])
+ result = GraphDelta()
+ result.heads = set(['C', 'D'])
+ result.cut_from = set(['F'])
+ self.check_delta(result, graph_from, graph_to)
+
+ def testMiddleSegmentWithMergeEnd(self):
+ # grabbing a delta where the to graph is totally interior and contains the end
+ # of a branch in the graph
+ graph_from = Graph()
+ graph_from.add_node('A', ['B'])
+ graph_from.add_node('B', ['C', 'D'])
+ graph_from.add_node('C', ['E'])
+ graph_from.add_node('D', ['E'])
+ graph_from.add_node('E', ['F'])
+ graph_from.add_node('F', [])
+ graph_to = Graph()
+ graph_to.add_node('B', ['C', 'D'])
+ result = GraphDelta()
+ result.heads = set(['B'])
+ result.cut_from = set(['C', 'D'])
+ self.check_delta(result, graph_from, graph_to)
+
+ def testMiddleSegmentWithPartitionedMerge(self):
+ # grabbing a delta where the to graph is totally interior and contains both
+ # sides of a branch and merge, but not the start or end.
+ graph_from = Graph()
+ graph_from.add_node('A', ['B'])
+ graph_from.add_node('B', ['C', 'D'])
+ graph_from.add_node('C', ['E'])
+ graph_from.add_node('D', ['E'])
+ graph_from.add_node('E', ['F'])
+ graph_from.add_node('F', [])
+ graph_to = Graph()
+ graph_to.add_node('C', ['E'])
+ graph_to.add_node('D', ['E'])
+ result = GraphDelta()
+ result.heads = set(['C', 'D'])
+ result.cut_from = set(['E'])
+ self.check_delta(result, graph_from, graph_to)
+
+ def testMiddleSegmentWithLeftSideMerge(self):
+ # grabbing a delta where the to graph is totally interior and contains the left
+ # side of a branch and merge but not the right side.
+ graph_from = Graph()
+ graph_from.add_node('A', ['B'])
+ graph_from.add_node('B', ['C', 'D'])
+ graph_from.add_node('C', ['E'])
+ graph_from.add_node('D', ['E'])
+ graph_from.add_node('E', ['F'])
+ graph_from.add_node('F', [])
+ graph_to = Graph()
+ graph_to.add_node('C', ['E'])
+ graph_to.add_node('E', ['F'])
+ result = GraphDelta()
+ result.heads = set(['C'])
+ result.cut_from = set(['F'])
+ self.check_delta(result, graph_from, graph_to)
+
+ def testMiddleSegmentWithRightSideMerge(self):
+ # grabbing a delta where the to graph is totally interior and contains the right
+ # side of a branch and merge but not the left side.
+ graph_from = Graph()
+ graph_from.add_node('A', ['B'])
+ graph_from.add_node('B', ['C', 'D'])
+ graph_from.add_node('C', ['E'])
+ graph_from.add_node('D', ['E'])
+ graph_from.add_node('E', ['F'])
+ graph_from.add_node('F', [])
+ graph_to = Graph()
+ graph_to.add_node('D', ['E'])
+ graph_to.add_node('E', ['F'])
+ result = GraphDelta()
+ result.heads = set(['D'])
+ result.cut_from = set(['F'])
+ self.check_delta(result, graph_from, graph_to)
+
+ def testEndOfGraph(self):
+ # grabbing a delta from A->B->C->D->E,F to C->D->E,F:
+ graph_from = Graph()
+ graph_from.add_node('A', ['B'])
+ graph_from.add_node('B', ['C'])
+ graph_from.add_node('C', ['D'])
+ graph_from.add_node('D', ['E', 'F'])
+ graph_from.add_node('E', [])
+ graph_from.add_node('F', [])
+ graph_to = Graph()
+ graph_to.add_node('C', ['D'])
+ graph_to.add_node('D', ['E', 'F'])
+ graph_to.add_node('E', [])
+ graph_to.add_node('F', [])
+ result = GraphDelta()
+ result.heads = set(['C'])
+ self.check_delta(result, graph_from, graph_to)
+
+ def testPartitionedEndOfGraph(self):
+ # grabbing a delta from A->B->C->D,E D->F, E->G to D->F and E->G :
+ graph_from = Graph()
+ graph_from.add_node('A', ['B'])
+ graph_from.add_node('B', ['C'])
+ graph_from.add_node('C', ['D', 'E'])
+ graph_from.add_node('D', ['F'])
+ graph_from.add_node('E', ['G'])
+ graph_from.add_node('F', [])
+ graph_from.add_node('G', [])
+ graph_to = Graph()
+ graph_to.add_node('D', ['F'])
+ graph_to.add_node('E', ['G'])
+ graph_to.add_node('F', [])
+ graph_to.add_node('G', [])
+ result = GraphDelta()
+ result.heads = set(['D', 'E'])
+ self.check_delta(result, graph_from, graph_to)
+
+ def testStartOfGraph(self):
+ # grabbing a delta from A->B->C to A->B:
+ graph_from = Graph()
+ graph_from.add_node('A', ['B'])
+ graph_from.add_node('B', ['C'])
+ graph_from.add_node('C', [])
+ graph_to = Graph()
+ graph_to.add_node('A', ['B'])
+ graph_to.add_node('B', ['C'])
+ result = GraphDelta()
+ result.cut_from = set(['C'])
+ self.check_delta(result, graph_from, graph_to)
+
+ def testStartOfGraphMultipleHeads(self):
+ # grabbing a delta which grabs the top of a graph with multiple hedas.
+ graph_from = Graph()
+ graph_from.add_node('A1', ['B1'])
+ graph_from.add_node('A2', ['B2'])
+ graph_from.add_node('B1', ['C'])
+ graph_from.add_node('B2', ['C'])
+ graph_from.add_node('C', [])
+ graph_to = Graph()
+ graph_to.add_node('A1', ['B1'])
+ graph_to.add_node('A2', ['B2'])
+ graph_to.add_node('B1', ['C'])
+ graph_to.add_node('B2', ['C'])
+ result = GraphDelta()
+ result.cut_from = set(['C'])
+ self.check_delta(result, graph_from, graph_to)
+
+ def testStartOfPartitionedGraph(self):
+ # grabbing a delta which grabs the top of a partitioned graph.
+ graph_from = Graph()
+ graph_from.add_node('A1', ['B1'])
+ graph_from.add_node('A2', ['B2'])
+ graph_from.add_node('B1', ['C1'])
+ graph_from.add_node('B2', ['C2'])
+ graph_from.add_node('C1', [])
+ graph_from.add_node('C2', [])
+ graph_to = Graph()
+ graph_to.add_node('A1', ['B1'])
+ graph_to.add_node('A2', ['B2'])
+ graph_to.add_node('B1', ['C1'])
+ graph_to.add_node('B2', ['C2'])
+ result = GraphDelta()
+ result.cut_from = set(['C1', 'C2'])
+ self.check_delta(result, graph_from, graph_to)
+
+ def testOneSideOfPartitionedGraph(self):
+ # grabbing a delta which grabs one side of a partitioned graph.
+ graph_from = Graph()
+ graph_from.add_node('A1', ['B1'])
+ graph_from.add_node('A2', ['B2'])
+ graph_from.add_node('B1', ['C1'])
+ graph_from.add_node('B2', ['C2'])
+ graph_from.add_node('C1', [])
+ graph_from.add_node('C2', [])
+ graph_to = Graph()
+ graph_to.add_node('A1', ['B1'])
+ graph_to.add_node('B1', ['C1'])
+ result = GraphDelta()
+ result.heads = set(['A1'])
+ # we only need to cut at C1 because the heads list removes the other
+ # partition implicitly.
+ result.cut_from = set(['C1'])
+ self.check_delta(result, graph_from, graph_to)
+
+ def testApplyToGraphWithGhosts(self):
+ # applying a delta when the graph has ghosts should not error.
+ graph_from = Graph()
+ graph_from.add_node('A', ['B'])
+ graph_from.add_ghost('B')
+ graph_to = Graph()
+ graph_to.add_node('A', ['B'])
+ graph_to.add_ghost('B')
+ result = GraphDelta()
+ self.check_delta(result, graph_from, graph_to)
+
+
+class TestNodeSelector(TestCase):
+
+ def check_selection(self, expected_selection, graph, selector):
+ """Check the node selection from graph using selector.
+
+ This will generate a new selection and check its identical to the
+ expected selection.
+ """
+ selection = selector.select(graph)
+ self.assertIsInstance(selection, list)
+ self.assertEqual(expected_selection, selection)
+
+ def test_constructor_bounds(self):
+ """NthNodeSelector cannot be constructed with a limit of 0."""
+ self.assertRaises(ValueError, NthNodeSelector, 0)
+
+ def test_nth_empty(self):
+ # the nth selector on an empty graph -> []
+ graph = Graph()
+ selector = NthNodeSelector()
+ self.check_selection([], graph, NthNodeSelector(1))
+ self.check_selection([], graph, NthNodeSelector(2))
+
+ def test_nth_one_node(self):
+ # the nth selector on {A:[B]} -> [A]
+ graph = Graph()
+ graph.add_node('A', ['B'])
+ self.check_selection(['A'], graph, NthNodeSelector(1))
+ self.check_selection(['A'], graph, NthNodeSelector(2))
+ self.check_selection(['A'], graph, NthNodeSelector(200))
+
+ def test_nth_two_nodes(self):
+ # the nth selector on {A:[B], C:[B]}
+ graph = Graph()
+ graph.add_node('A', ['B'])
+ graph.add_node('C', ['B'])
+ selector = NthNodeSelector()
+ self.check_selection(['C'], graph, NthNodeSelector(1))
+ self.check_selection(['A', 'C'], graph, NthNodeSelector(2))
+ self.check_selection(['A', 'C'], graph, NthNodeSelector(200))
+
+ def test_nth_three_nodes(self):
+ # the nth selector on {A:[B], C:[B], B:[]}
+ graph = Graph()
+ graph.add_node('A', ['B'])
+ graph.add_node('B', [])
+ graph.add_node('C', ['B'])
+ selector = NthNodeSelector()
+ self.check_selection(['B'], graph, NthNodeSelector(1))
+ self.check_selection(['A', 'B'], graph, NthNodeSelector(2))
+ self.check_selection(['A', 'B', 'C'], graph, NthNodeSelector(3))
+ self.check_selection(['A', 'B', 'C'], graph, NthNodeSelector(200))
+
+ def test_nth_four_nodes(self):
+ # the nth selector on {A:[B], C:[B], B:[], E:[A]}
+ graph = Graph()
+ graph.add_node('A', ['B'])
+ graph.add_node('B', [])
+ graph.add_node('C', ['B'])
+ graph.add_node('E', ['A'])
+ selector = NthNodeSelector()
+ self.check_selection(['C'], graph, NthNodeSelector(1))
+ self.check_selection(['B', 'E'], graph, NthNodeSelector(2))
+ self.check_selection(['A', 'B', 'C'], graph, NthNodeSelector(3))
+ self.check_selection(['A', 'B', 'C', 'E'], graph, NthNodeSelector(4))
+ self.check_selection(['A', 'B', 'C', 'E'], graph, NthNodeSelector(200))
More information about the bazaar-commits
mailing list