[MERGE] reconcile - aka fix - with tests

John Arbash Meinel john at arbash-meinel.com
Thu Feb 23 15:29:20 GMT 2006


Robert Collins wrote:
> $ diffstat reconcile.patch 
>  builtins.py                                        |   26 ++++
>  commit.py                                          |   44 ++-----
>  reconcile.py                                       |  119
> +++++++++++++++++++++
>  repository.py                                      |   49 ++++++++
>  store/weave.py                                     |    2 
>  tests/blackbox/__init__.py                         |   21 +++
>  tests/blackbox/test_reconcile.py                   |   51 +++++++++
>  tests/blackbox/test_upgrade.py                     |   19 ---
>  tests/repository_implementations/__init__.py       |    1 
>  tests/repository_implementations/test_reconcile.py |  106
> ++++++++++++++++++
>  10 files changed, 389 insertions(+), 49 deletions(-)
> $ wc -l reconcile.patch 
> 575 reconcile.patch
> 
> This patch adds the start of the planned reconcile command. The first
> thing it does is a reweave. In the future it will do root id
> reconciliation at the same time, and hopefully we can parallise all the
> actions.
> 
> Rob
> 
> 
> 
> ------------------------------------------------------------------------
> 
> === added file 'bzrlib/reconcile.py'
> --- /dev/null	
> +++ bzrlib/reconcile.py	
> @@ -0,0 +1,119 @@
> +# (C) 2005, 2006 Canonical Limited.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write to the Free Software
> +# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
> +
> +"""Plugin to fix some potential data errors in a branch.
> +
> +This makes sure that the inventory weave's DAG of ancestry is correct so that
> +attempts to fetch the branch over http, or certain merge operations cope
> +correctly.
> +
> +This is most likely needed if you have used fetch-ghosts from bzrlib to
> +resolve ghosts after a baz (or otherwise) import and you get bizarre behaviour
> +when either exporting over http or when merging from other translated branches.
> +"""
> +
> +
> +import os
> +
> +
> +import bzrlib.branch
> +import bzrlib.progress
> +from bzrlib.trace import mutter
> +import bzrlib.ui as ui
> +from bzrlib.weavefile import write_weave_v5 as w5
> +
> +
> +
> +def reconcile(dir):
> +    """Reconcile the data in dir.
> +
> +    Currently this is limited to a inventory 'reweave'.
> +
> +    This is a convenience method, and the public api, for using a 
> +    Reconciler object.
> +    """
> +    reconciler = Reconciler(dir)
> +    reconciler.reconcile()
> +
> +
> +class Reconciler(object):
> +    """Reconcilers are used to reconcile existing data.
> +
> +    Currently this is limited to a single repository, and consists
> +    of an inventory reweave with revision cross-checks.
> +    """
> +
> +    def __init__(self, dir):
> +        self.bzrdir = dir
> +
> +    def reconcile(self):
> +        """Actually perform the reconciliation."""
> +        self.pb = ui.ui_factory.progress_bar()
> +        self.repo = self.bzrdir.open_repository()
> +        self.repo.lock_write()
> +        try:
> +            self.pb.note('Reconciling repository %s',
> +                         self.repo.bzrdir.root_transport.base)
> +            self._reweave_inventory()
> +        finally:
> +            self.repo.unlock()
> +        self.pb.note('Reconciliation complete.')
> +

It seems like it would be better to check if the inventory needs to be
reweaved, which is a moderate expense, rather than always reweaving,
which is very expensive. It also gives you a chance to load all of the
revision texts into memory, and generate the ancestry graph, so that you
know which inventories need to be added first.

Also, if we are working on inventory by inventory, I would recommend
that we check the weaves/* themselves to make sure they contain what the
inventory references.

> +    def _reweave_inventory(self):
> +        """Regenerate the inventory weave for the repository from scratch."""
> +        self.pb.update('Reading inventory data.')
> +        inventory = self.repo.get_inventory_weave()
> +        self.repo.control_weaves.put_weave('inventory.backup',
> +                                           inventory,
> +                                           self.repo.get_transaction())
> +        self.pb.note('Backup Inventory created.')
> +        # asking for '' should never return a non-empty weave
> +        new_inventory = self.repo.control_weaves.get_weave_or_empty('',
> +            self.repo.get_transaction())
> +
> +        pending = [file_id for file_id in self.repo.revision_store]
> +        self.total = len(pending)
> +        self.count = 0
> +
> +        # FIXME this loop is potentially N + N-1 + N-2 etc loops,
> +        # which is rather terrible. Better to make a stack of the
> +        # current desired-for-availability revisions and do those 
> +        # preferentially. RBC 20060223
> +        while pending:
> +            rev_id = pending.pop(0)
> +
> +            self.pb.update('regenerating', self.count, self.total)
> +
> +            rev = self.repo.get_revision(rev_id)
> +            parents = []
> +            for parent in rev.parent_ids:
> +                if parent in inventory:
> +                    parents.append(parent)
> +                else:
> +                    mutter('found ghost %s', parent)
> +            unavailable = [p for p in parents if p not in new_inventory]
> +            if len(unavailable) == 0:
> +                new_inventory.add(rev_id, parents, inventory.get(rev_id))
> +                self.count += 1
> +            else:
> +                pending.append(rev_id)
> +
> +                
> +        self.pb.update('Writing weave')
> +        self.repo.control_weaves.put_weave('inventory',
> +                                           new_inventory,
> +                                           self.repo.get_transaction())
> +        self.pb.note('Inventory regenerated.')
> 
> === added file 'bzrlib/tests/blackbox/test_reconcile.py'
> --- /dev/null	
> +++ bzrlib/tests/blackbox/test_reconcile.py	
> @@ -0,0 +1,51 @@
> +# Copyright (C) 2006 by Canonical Ltd
> +# Authors: Robert Collins <robert.collins at canonical.com>
> +# -*- coding: utf-8 -*-
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write to the Free Software
> +# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
> +
> +"""Black box tests for the reconcile command."""
> +
> +
> +import bzrlib.bzrdir as bzrdir
> +import bzrlib.repository as repository
> +from bzrlib.tests import TestCaseWithTransport
> +from bzrlib.tests.blackbox import TestUIFactory
> +from bzrlib.transport import get_transport
> +import bzrlib.ui as ui
> +
> +
> +class TrivialTest(TestCaseWithTransport):
> +
> +    def setUp(self):
> +        super(TrivialTest, self).setUp()
> +        self.old_format = bzrdir.BzrDirFormat.get_default_format()
> +        self.old_ui_factory = ui.ui_factory
> +        self.addCleanup(self.restoreDefaults)
> +        ui.ui_factory = TestUIFactory()
> +
> +    def restoreDefaults(self):
> +        ui.ui_factory = self.old_ui_factory
> +
> +    def test_trivial_reconcile(self):
> +        t = bzrdir.BzrDir.create_standalone_workingtree('.')
> +        (out, err) = self.run_bzr_captured(['reconcile'])
> +        self.assertEqualDiff(out, "Reconciling repository %s\n"
> +                                  "Backup Inventory created.\n"
> +                                  "Inventory regenerated.\n"
> +                                  "Reconciliation complete.\n" %
> +                                  t.bzrdir.root_transport.base)
> +        self.assertEqualDiff(err, "")
> +
> 
> === added file 'bzrlib/tests/repository_implementations/test_reconcile.py'
> --- /dev/null	
> +++ bzrlib/tests/repository_implementations/test_reconcile.py	
> @@ -0,0 +1,106 @@
> +# Copyright (C) 2006 by Canonical Ltd
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write to the Free Software
> +# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
> +
> +"""Tests for reconiliation of repositories."""
> +
> +
> +import bzrlib
> +import bzrlib.errors as errors
> +from bzrlib.reconcile import reconcile
> +from bzrlib.revision import Revision
> +from bzrlib.tests.repository_implementations.test_repository import TestCaseWithRepository
> +from bzrlib.transport import get_transport
> +from bzrlib.tree import EmptyTree
> +from bzrlib.workingtree import WorkingTree
> +
> +
> +class TestNeedingReweave(TestCaseWithRepository):
> +
> +    def setUp(self):
> +        super(TestNeedingReweave, self).setUp()
> +        
> +        t = get_transport(self.get_url())
> +        # an empty inventory with no revision for testing with.
> +        repo = self.make_repository('inventory_no_revision')
> +        inv = EmptyTree().inventory
> +        repo.add_inventory('missing', inv, [])
> +
> +        # a inventory with no parents and the revision has parents..
> +        # i.e. a ghost.
> +        t.copy_tree('inventory_no_revision', 'inventory_one_ghost')
> +        repo = bzrlib.repository.Repository.open('inventory_one_ghost')
> +        sha1 = repo.add_inventory('ghost', inv, [])
> +        rev = Revision(timestamp=0,
> +                       timezone=None,
> +                       committer="Foo Bar <foo at example.com>",
> +                       message="Message",
> +                       inventory_sha1=sha1,
> +                       revision_id='ghost')
> +    

...

> -import bzrlib.gpg as gpg
>  from bzrlib.revision import Revision
>  from bzrlib.testament import Testament
>  from bzrlib.trace import mutter, note, warning
> @@ -295,7 +294,11 @@
>              if len(list(self.work_tree.iter_conflicts()))>0:
>                  raise ConflictsInTree
>  
> -            self._record_inventory()
> +            self.inv_sha1 = self.branch.repository.add_inventory(
> +                self.rev_id,
> +                self.new_inv,
> +                self.present_parents
> +                )
>              self._make_revision()
>              self.work_tree.set_pending_merges([])
>              self.branch.append_revision(self.rev_id)

This looks good. Though I thought we were looking to use the Testament
sha1 in Revision entries.


In general, +1, though it would be nice if you checked to make sure a
reweave was necessary before actually doing the reweave.

(And if you load the revisions ahead of time, you can do "topo_sort" to
make sure they all get added ancestors first).

John
=:->

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 254 bytes
Desc: OpenPGP digital signature
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060223/79ef59c1/attachment.pgp 


More information about the bazaar mailing list