Rev 2682: (robertc) Introduce write_groups for repositories, allowing repositories with the physical ability to do transactional data insertion to have that modelled within bzr. (Robert Collins). in file:///home/pqm/archives/thelove/bzr/%2Btrunk/

Canonical.com Patch Queue Manager pqm at pqm.ubuntu.com
Wed Aug 8 01:28:24 BST 2007


At file:///home/pqm/archives/thelove/bzr/%2Btrunk/

------------------------------------------------------------
revno: 2682
revision-id: pqm at pqm.ubuntu.com-20070808002810-703n3mr6b6hwataj
parent: pqm at pqm.ubuntu.com-20070807225230-7gb6fot3mnsxp7zs
parent: robertc at robertcollins.net-20070807225945-dlxppeb3we4lh897
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Wed 2007-08-08 01:28:10 +0100
message:
  (robertc) Introduce write_groups for repositories, allowing repositories with the physical ability to do transactional data insertion to have that modelled within bzr. (Robert Collins).
added:
  bzrlib/tests/repository_implementations/test_write_group.py test_write_group.py-20070716105516-89n34xtogq5frn0m-1
modified:
  NEWS                           NEWS-20050323055033-4e00b5db738777ff
  bzrlib/commit.py               commit.py-20050511101309-79ec1a0168e0e825
  bzrlib/fetch.py                fetch.py-20050818234941-26fea6105696365d
  bzrlib/remote.py               remote.py-20060720103555-yeeg2x51vn0rbtdp-1
  bzrlib/repofmt/knitrepo.py     knitrepo.py-20070206081537-pyy4a00xdas0j4pf-1
  bzrlib/repository.py           rev_storage.py-20051111201905-119e9401e46257e3
  bzrlib/tests/branch_implementations/test_branch.py testbranch.py-20050711070244-121d632bc37d7253
  bzrlib/tests/interrepository_implementations/test_interrepository.py test_interrepository.py-20060220061411-1ec13fa99e5e3eee
  bzrlib/tests/repository_implementations/__init__.py __init__.py-20060131092037-9564957a7d4a841b
  bzrlib/tests/repository_implementations/test_commit_builder.py test_commit_builder.py-20060606110838-76e3ra5slucqus81-1
  bzrlib/tests/repository_implementations/test_reconcile.py test_reconcile.py-20060223022332-572ef70a3288e369
  bzrlib/tests/repository_implementations/test_repository.py test_repository.py-20060131092128-ad07f494f5c9d26c
  bzrlib/tests/workingtree_implementations/test_inv.py test_inv.py-20070311221604-ighlq8tbn5xq0kuo-1
  doc/developers/repository.txt  repository.txt-20070709152006-xkhlek456eclha4u-1
    ------------------------------------------------------------
    revno: 2617.6.9
    merged: robertc at robertcollins.net-20070807225945-dlxppeb3we4lh897
    parent: robertc at robertcollins.net-20070801002037-cjf6ja1y3zzw5f2e
    parent: pqm at pqm.ubuntu.com-20070807225230-7gb6fot3mnsxp7zs
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: integration
    timestamp: Wed 2007-08-08 08:59:45 +1000
    message:
      Merge bzr.dev.
    ------------------------------------------------------------
    revno: 2617.6.8
    merged: robertc at robertcollins.net-20070801002037-cjf6ja1y3zzw5f2e
    parent: robertc at robertcollins.net-20070731005340-g78eko5xn23lhpq3
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: repo-write-group
    timestamp: Wed 2007-08-01 10:20:37 +1000
    message:
      Review feedback and documentation.
    ------------------------------------------------------------
    revno: 2617.6.7
    merged: robertc at robertcollins.net-20070731005340-g78eko5xn23lhpq3
    parent: robertc at robertcollins.net-20070731005211-vcncoanpgu8qnu03
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: repo-write-group
    timestamp: Tue 2007-07-31 10:53:40 +1000
    message:
      More review feedback.
    ------------------------------------------------------------
    revno: 2617.6.6
    merged: robertc at robertcollins.net-20070731005211-vcncoanpgu8qnu03
    parent: robertc at robertcollins.net-20070720022713-z6x6cns4dy1mzhpk
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: repo-write-group
    timestamp: Tue 2007-07-31 10:52:11 +1000
    message:
      Some review feedback.
    ------------------------------------------------------------
    revno: 2617.6.5
    merged: robertc at robertcollins.net-20070720022713-z6x6cns4dy1mzhpk
    parent: robertc at robertcollins.net-20070719002827-t1pu9alpneh3k966
    parent: pqm at pqm.ubuntu.com-20070720015347-eaeqmggngaemmbde
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: repo-write-group
    timestamp: Fri 2007-07-20 12:27:13 +1000
    message:
      Merge bzr.dev.
    ------------------------------------------------------------
    revno: 2617.6.4
    merged: robertc at robertcollins.net-20070719002827-t1pu9alpneh3k966
    parent: robertc at robertcollins.net-20070718234410-sa56x9ui5la405jf
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: repo-write-group
    timestamp: Thu 2007-07-19 10:28:27 +1000
    message:
      Update tests with things that break when a repository requires write groups to be used.
    ------------------------------------------------------------
    revno: 2617.6.3
    merged: robertc at robertcollins.net-20070718234410-sa56x9ui5la405jf
    parent: robertc at robertcollins.net-20070717160400-l1ng2wgmldsttebk
    parent: pqm at pqm.ubuntu.com-20070717110203-zzmtp28nunhsoz12
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: repo-write-group
    timestamp: Thu 2007-07-19 09:44:10 +1000
    message:
      Merge bzr.dev.
    ------------------------------------------------------------
    revno: 2617.6.2
    merged: robertc at robertcollins.net-20070717160400-l1ng2wgmldsttebk
    parent: robertc at robertcollins.net-20070716110120-9we93ynxjza948vd
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: repo-write-group
    timestamp: Wed 2007-07-18 02:04:00 +1000
    message:
      Add abort_write_group and wire write_groups into fetch and commit.
    ------------------------------------------------------------
    revno: 2617.6.1
    merged: robertc at robertcollins.net-20070716110120-9we93ynxjza948vd
    parent: pqm at pqm.ubuntu.com-20070713074627-93zxs9uh528y0fki
    committer: Robert Collins <robertc at robertcollins.net>
    branch nick: repo-write-group
    timestamp: Mon 2007-07-16 21:01:20 +1000
    message:
      * New method on Repository - ``start_write_group``, ``end_write_group``
        and ``is_in_write_group`` - which provide a clean hook point for 
        transactional Repositories - ones where all the data for a fetch or
        commit needs to be made atomically available in one step. This allows
        the write lock to remain while making a series of data insertions.
        (e.g. data conversion). (Robert Collins)
=== added file 'bzrlib/tests/repository_implementations/test_write_group.py'
--- a/bzrlib/tests/repository_implementations/test_write_group.py	1970-01-01 00:00:00 +0000
+++ b/bzrlib/tests/repository_implementations/test_write_group.py	2007-07-17 16:04:00 +0000
@@ -0,0 +1,96 @@
+# Copyright (C) 2007 Canonical Ltd
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+
+"""Tests for repository write groups."""
+
+from bzrlib import errors
+from bzrlib.tests.repository_implementations.test_repository import TestCaseWithRepository
+
+
+class TestWriteGroup(TestCaseWithRepository):
+
+    def test_start_write_group_unlocked_needs_write_lock(self):
+        repo = self.make_repository('.')
+        self.assertRaises(errors.NotWriteLocked, repo.start_write_group)
+
+    def test_start_write_group_read_locked_needs_write_lock(self):
+        repo = self.make_repository('.')
+        repo.lock_read()
+        try:
+            self.assertRaises(errors.NotWriteLocked, repo.start_write_group)
+        finally:
+            repo.unlock()
+
+    def test_start_write_group_write_locked_gets_None(self):
+        repo = self.make_repository('.')
+        repo.lock_write()
+        self.assertEqual(None, repo.start_write_group())
+        repo.commit_write_group()
+        repo.unlock()
+
+    def test_start_write_group_twice_errors(self):
+        repo = self.make_repository('.')
+        repo.lock_write()
+        repo.start_write_group()
+        try:
+            # don't need a specific exception for now - this is 
+            # really to be sure it's used right, not for signalling
+            # semantic information.
+            self.assertRaises(errors.BzrError, repo.start_write_group)
+        finally:
+            repo.commit_write_group()
+            repo.unlock()
+
+    def test_commit_write_group_gets_None(self):
+        repo = self.make_repository('.')
+        repo.lock_write()
+        repo.start_write_group()
+        self.assertEqual(None, repo.commit_write_group())
+        repo.unlock()
+
+    def test_unlock_after_start_errors(self):
+        repo = self.make_repository('.')
+        repo.lock_write()
+        repo.start_write_group()
+        # don't need a specific exception for now - this is 
+        # really to be sure it's used right, not for signalling
+        # semantic information.
+        self.assertRaises(errors.BzrError, repo.unlock)
+        self.assertTrue(repo.is_locked())
+        repo.commit_write_group()
+        repo.unlock()
+
+    def test_is_in_write_group(self):
+        repo = self.make_repository('.')
+        self.assertFalse(repo.is_in_write_group())
+        repo.lock_write()
+        repo.start_write_group()
+        self.assertTrue(repo.is_in_write_group())
+        repo.commit_write_group()
+        self.assertFalse(repo.is_in_write_group())
+        # abort also removes the in_write_group status.
+        repo.start_write_group()
+        self.assertTrue(repo.is_in_write_group())
+        repo.abort_write_group()
+        self.assertFalse(repo.is_in_write_group())
+        repo.unlock()
+
+    def test_abort_write_group_gets_None(self):
+        repo = self.make_repository('.')
+        repo.lock_write()
+        repo.start_write_group()
+        self.assertEqual(None, repo.abort_write_group())
+        repo.unlock()

=== modified file 'NEWS'
--- a/NEWS	2007-08-07 20:01:09 +0000
+++ b/NEWS	2007-08-07 22:59:45 +0000
@@ -214,6 +214,13 @@
     * InterTree.compare now passes require_versioned on correctly.
       (Marius Kruger)
 
+    * New methods on Repository - ``start_write_group``,
+      ``commit_write_group``, ``abort_write_group`` and ``is_in_write_group`` -
+      which provide a clean hook point for transactional Repositories - ones
+      where all the data for a fetch or commit needs to be made atomically
+      available in one step. This allows the write lock to remain while making
+      a series of data insertions.  (e.g. data conversion). (Robert Collins)
+
   TESTING:
 
     * Remove selftest ``--clean-output``, ``--numbered-dirs`` and

=== modified file 'bzrlib/commit.py'
--- a/bzrlib/commit.py	2007-07-23 03:57:11 +0000
+++ b/bzrlib/commit.py	2007-08-07 22:59:45 +0000
@@ -296,25 +296,30 @@
                     entries_title="Directory")
             self.builder = self.branch.get_commit_builder(self.parents,
                 self.config, timestamp, timezone, committer, revprops, rev_id)
-            self._update_builder_with_changes()
-            self._check_pointless()
-
-            # TODO: Now the new inventory is known, check for conflicts.
-            # ADHB 2006-08-08: If this is done, populate_new_inv should not add
-            # weave lines, because nothing should be recorded until it is known
-            # that commit will succeed.
-            self._set_progress_stage("Saving data locally")
-            self.builder.finish_inventory()
-
-            # Prompt the user for a commit message if none provided
-            message = message_callback(self)
-            assert isinstance(message, unicode), type(message)
-            self.message = message
-            self._escape_commit_message()
-
-            # Add revision data to the local branch
-            self.rev_id = self.builder.commit(self.message)
-            
+            try:
+                self._update_builder_with_changes()
+                self._check_pointless()
+
+                # TODO: Now the new inventory is known, check for conflicts.
+                # ADHB 2006-08-08: If this is done, populate_new_inv should not add
+                # weave lines, because nothing should be recorded until it is known
+                # that commit will succeed.
+                self._set_progress_stage("Saving data locally")
+                self.builder.finish_inventory()
+
+                # Prompt the user for a commit message if none provided
+                message = message_callback(self)
+                assert isinstance(message, unicode), type(message)
+                self.message = message
+                self._escape_commit_message()
+
+                # Add revision data to the local branch
+                self.rev_id = self.builder.commit(self.message)
+            except:
+                # perhaps this should be done by the CommitBuilder ?
+                self.work_tree.branch.repository.abort_write_group()
+                raise
+
             # Upload revision data to the master.
             # this will propagate merged revisions too if needed.
             if self.bound_branch:

=== modified file 'bzrlib/fetch.py'
--- a/bzrlib/fetch.py	2007-06-22 22:19:13 +0000
+++ b/bzrlib/fetch.py	2007-07-20 02:27:13 +0000
@@ -108,7 +108,14 @@
         try:
             self.to_repository.lock_write()
             try:
-                self.__fetch()
+                self.to_repository.start_write_group()
+                try:
+                    self.__fetch()
+                except:
+                    self.to_repository.abort_write_group()
+                    raise
+                else:
+                    self.to_repository.commit_write_group()
             finally:
                 if self.nested_pb is not None:
                     self.nested_pb.finished()

=== modified file 'bzrlib/remote.py'
--- a/bzrlib/remote.py	2007-08-07 07:04:58 +0000
+++ b/bzrlib/remote.py	2007-08-07 22:59:45 +0000
@@ -249,10 +249,28 @@
         self._lock_count = 0
         self._leave_lock = False
 
-    def has_same_location(self, other):
-        return (self.__class__ == other.__class__ and
-                self.bzrdir.transport.base == other.bzrdir.transport.base)
-        
+    def abort_write_group(self):
+        """Complete a write group on the decorated repository.
+        
+        Smart methods peform operations in a single step so this api
+        is not really applicable except as a compatibility thunk
+        for older plugins that don't use e.g. the CommitBuilder
+        facility.
+        """
+        self._ensure_real()
+        return self._real_repository.abort_write_group()
+
+    def commit_write_group(self):
+        """Complete a write group on the decorated repository.
+        
+        Smart methods peform operations in a single step so this api
+        is not really applicable except as a compatibility thunk
+        for older plugins that don't use e.g. the CommitBuilder
+        facility.
+        """
+        self._ensure_real()
+        return self._real_repository.commit_write_group()
+
     def _ensure_real(self):
         """Ensure that there is a _real_repository set.
 
@@ -303,6 +321,10 @@
         assert response[0] in ('yes', 'no'), 'unexpected response code %s' % (response,)
         return response[0] == 'yes'
 
+    def has_same_location(self, other):
+        return (self.__class__ == other.__class__ and
+                self.bzrdir.transport.base == other.bzrdir.transport.base)
+        
     def get_graph(self, other_repository=None):
         """Return the graph for this repository format"""
         return self._real_repository.get_graph(other_repository)
@@ -341,6 +363,17 @@
         """See Repository.get_physical_lock_status()."""
         return False
 
+    def is_in_write_group(self):
+        """Return True if there is an open write group.
+
+        write groups are only applicable locally for the smart server..
+        """
+        if self._real_repository:
+            return self._real_repository.is_in_write_group()
+
+    def is_locked(self):
+        return self._lock_count >= 1
+
     def is_shared(self):
         """See Repository.is_shared()."""
         path = self.bzrdir._path_for_remote_call(self._client)
@@ -412,6 +445,17 @@
         elif self._lock_mode == 'r':
             self._real_repository.lock_read()
 
+    def start_write_group(self):
+        """Start a write group on the decorated repository.
+        
+        Smart methods peform operations in a single step so this api
+        is not really applicable except as a compatibility thunk
+        for older plugins that don't use e.g. the CommitBuilder
+        facility.
+        """
+        self._ensure_real()
+        return self._real_repository.start_write_group()
+
     def _unlock(self, token):
         path = self.bzrdir._path_for_remote_call(self._client)
         response = self._client.call('Repository.unlock', path, token)
@@ -423,6 +467,11 @@
             raise errors.UnexpectedSmartServerResponse(response)
 
     def unlock(self):
+        if self._lock_count == 1 and self._lock_mode == 'w':
+            # don't unlock if inside a write group.
+            if self.is_in_write_group():
+                raise errors.BzrError(
+                    'Must end write groups before releasing write locks.')
         self._lock_count -= 1
         if not self._lock_count:
             mode = self._lock_mode

=== modified file 'bzrlib/repofmt/knitrepo.py'
--- a/bzrlib/repofmt/knitrepo.py	2007-07-25 21:26:30 +0000
+++ b/bzrlib/repofmt/knitrepo.py	2007-08-07 22:59:45 +0000
@@ -266,8 +266,10 @@
         :param revision_id: Optional revision id.
         """
         revision_id = osutils.safe_revision_id(revision_id)
-        return RootCommitBuilder(self, parents, config, timestamp, timezone,
+        result = RootCommitBuilder(self, parents, config, timestamp, timezone,
                                  committer, revprops, revision_id)
+        self.start_write_group()
+        return result
 
 
 class RepositoryFormatKnit(MetaDirRepositoryFormat):

=== modified file 'bzrlib/repository.py'
--- a/bzrlib/repository.py	2007-08-07 07:04:58 +0000
+++ b/bzrlib/repository.py	2007-08-07 22:59:45 +0000
@@ -82,6 +82,30 @@
         r'.*revision="(?P<revision_id>[^"]+)"'
         )
 
+    def abort_write_group(self):
+        """Commit the contents accrued within the current write group.
+
+        :seealso: start_write_group.
+        """
+        if self._write_group is not self.get_transaction():
+            # has an unlock or relock occured ?
+            raise errors.BzrError('mismatched lock context and write group.')
+        self._abort_write_group()
+        self._write_group = None
+
+    def _abort_write_group(self):
+        """Template method for per-repository write group cleanup.
+        
+        This is called during abort before the write group is considered to be 
+        finished and should cleanup any internal state accrued during the write
+        group. There is no requirement that data handed to the repository be
+        *not* made available - this is not a rollback - but neither should any
+        attempt be made to ensure that data added is fully commited. Abort is
+        invoked when an error has occured so futher disk or network operations
+        may not be possible or may error and if possible should not be
+        attempted.
+        """
+
     @needs_write_lock
     def add_inventory(self, revision_id, inv, parents):
         """Add the inventory inv to the repository as revision_id.
@@ -230,6 +254,7 @@
         # TODO: make sure to construct the right store classes, etc, depending
         # on whether escaping is required.
         self._warn_if_deprecated()
+        self._write_group = None
 
     def __repr__(self):
         return '%s(%r)' % (self.__class__.__name__, 
@@ -247,11 +272,22 @@
         return (self.control_files._transport.base ==
                 other.control_files._transport.base)
 
+    def is_in_write_group(self):
+        """Return True if there is an open write group.
+
+        :seealso: start_write_group.
+        """
+        return self._write_group is not None
+
     def is_locked(self):
         return self.control_files.is_locked()
 
     def lock_write(self, token=None):
         """Lock this repository for writing.
+
+        This causes caching within the repository obejct to start accumlating
+        data during reads, and allows a 'write_group' to be obtained. Write
+        groups must be used for actual data insertion.
         
         :param token: if this is already locked, then lock_write will fail
             unless the token matches the existing lock.
@@ -260,6 +296,7 @@
             instance doesn't support using token locks.
         :raises MismatchedToken: if the specified token doesn't match the token
             of the existing lock.
+        :seealso: start_write_group.
 
         A token should be passed in if you know that you have locked the object
         some other way, and need to synchronise this object's state with that
@@ -267,10 +304,13 @@
 
         XXX: this docstring is duplicated in many places, e.g. lockable_files.py
         """
-        return self.control_files.lock_write(token=token)
+        result = self.control_files.lock_write(token=token)
+        self._refresh_data()
+        return result
 
     def lock_read(self):
         self.control_files.lock_read()
+        self._refresh_data()
 
     def get_physical_lock_status(self):
         return self.control_files.get_physical_lock_status()
@@ -371,6 +411,26 @@
         revision_id = osutils.safe_revision_id(revision_id)
         return InterRepository.get(self, destination).copy_content(revision_id)
 
+    def commit_write_group(self):
+        """Commit the contents accrued within the current write group.
+
+        :seealso: start_write_group.
+        """
+        if self._write_group is not self.get_transaction():
+            # has an unlock or relock occured ?
+            raise errors.BzrError('mismatched lock context and write group.')
+        self._commit_write_group()
+        self._write_group = None
+
+    def _commit_write_group(self):
+        """Template method for per-repository write group cleanup.
+        
+        This is called before the write group is considered to be 
+        finished and should ensure that all data handed to the repository
+        for writing during the write group is safely committed (to the 
+        extent possible considering file system caching etc).
+        """
+
     def fetch(self, source, revision_id=None, pb=None):
         """Fetch the content required to construct revision_id from source.
 
@@ -401,10 +461,17 @@
         :param revision_id: Optional revision id.
         """
         revision_id = osutils.safe_revision_id(revision_id)
-        return _CommitBuilder(self, parents, config, timestamp, timezone,
+        result =_CommitBuilder(self, parents, config, timestamp, timezone,
                               committer, revprops, revision_id)
+        self.start_write_group()
+        return result
 
     def unlock(self):
+        if (self.control_files._lock_count == 1 and
+            self.control_files._lock_mode == 'w'):
+            if self._write_group is not None:
+                raise errors.BzrError(
+                    'Must end write groups before releasing write locks.')
         self.control_files.unlock()
 
     @needs_read_lock
@@ -422,6 +489,35 @@
         self.copy_content_into(dest_repo, revision_id)
         return dest_repo
 
+    def start_write_group(self):
+        """Start a write group in the repository.
+
+        Write groups are used by repositories which do not have a 1:1 mapping
+        between file ids and backend store to manage the insertion of data from
+        both fetch and commit operations.
+
+        A write lock is required around the start_write_group/commit_write_group
+        for the support of lock-requiring repository formats.
+
+        One can only insert data into a repository inside a write group.
+
+        :return: None.
+        """
+        if not self.is_locked() or self.control_files._lock_mode != 'w':
+            raise errors.NotWriteLocked(self)
+        if self._write_group:
+            raise errors.BzrError('already in a write group')
+        self._start_write_group()
+        # so we can detect unlock/relock - the write group is now entered.
+        self._write_group = self.get_transaction()
+
+    def _start_write_group(self):
+        """Template method for per-repository write group startup.
+        
+        This is called before the write group is considered to be 
+        entered.
+        """
+
     @needs_read_lock
     def sprout(self, to_bzrdir, revision_id=None):
         """Create a descendent repository for new development.
@@ -813,6 +909,16 @@
         reconciler.reconcile()
         return reconciler
 
+    def _refresh_data(self):
+        """Helper called from lock_* to ensure coherency with disk.
+
+        The default implementation does nothing; it is however possible
+        for repositories to maintain loaded indices across multiple locks
+        by checking inside their implementation of this method to see
+        whether their indices are still valid. This depends of course on
+        the disk format being validatable in this manner.
+        """
+
     @needs_read_lock
     def revision_tree(self, revision_id):
         """Return Tree for a revision on this branch.
@@ -1059,7 +1165,7 @@
 
     inv = revision_tree.inventory
     entries = inv.iter_entries()
-    # backwards compatability hack: skip the root id.
+    # backwards compatibility hack: skip the root id.
     if not repository.supports_rich_root():
         path, root = entries.next()
         if root.revision != rev.revision_id:
@@ -1938,8 +2044,9 @@
                        revision_id=self._new_revision_id,
                        properties=self._revprops)
         rev.parent_ids = self.parents
-        self.repository.add_revision(self._new_revision_id, rev, 
+        self.repository.add_revision(self._new_revision_id, rev,
             self.new_inventory, self._config)
+        self.repository.commit_write_group()
         return self._new_revision_id
 
     def revision_tree(self):

=== modified file 'bzrlib/tests/branch_implementations/test_branch.py'
--- a/bzrlib/tests/branch_implementations/test_branch.py	2007-07-19 06:34:09 +0000
+++ b/bzrlib/tests/branch_implementations/test_branch.py	2007-08-07 22:59:45 +0000
@@ -367,8 +367,12 @@
         result.report_results(verbose=False)
 
     def test_get_commit_builder(self):
-        self.assertIsInstance(self.make_branch(".").get_commit_builder([]), 
-            repository.CommitBuilder)
+        branch = self.make_branch(".")
+        branch.lock_write()
+        builder = branch.get_commit_builder([])
+        self.assertIsInstance(builder, repository.CommitBuilder)
+        branch.repository.commit_write_group()
+        branch.unlock()
 
     def test_generate_revision_history(self):
         """Create a fake revision history easily."""

=== modified file 'bzrlib/tests/interrepository_implementations/test_interrepository.py'
--- a/bzrlib/tests/interrepository_implementations/test_interrepository.py	2007-03-07 01:54:13 +0000
+++ b/bzrlib/tests/interrepository_implementations/test_interrepository.py	2007-07-17 16:04:00 +0000
@@ -182,6 +182,8 @@
         # this should ensure that the new versions of files are being checked
         # for during pull operations
         inv = source.get_inventory('a')
+        source.lock_write()
+        source.start_write_group()
         inv['id'].revision = 'b'
         inv.revision_id = 'b'
         sha1 = source.add_inventory('b', inv, ['a'])
@@ -193,6 +195,8 @@
                        revision_id='b')
         rev.parent_ids = ['a']
         source.add_revision('b', rev)
+        source.commit_write_group()
+        source.unlock()
         self.assertRaises(errors.RevisionNotPresent, target.fetch, source)
         self.assertFalse(target.has_revision('b'))
 

=== modified file 'bzrlib/tests/repository_implementations/__init__.py'
--- a/bzrlib/tests/repository_implementations/__init__.py	2007-08-07 07:47:27 +0000
+++ b/bzrlib/tests/repository_implementations/__init__.py	2007-08-07 22:59:45 +0000
@@ -108,6 +108,7 @@
         'bzrlib.tests.repository_implementations.test_repository',
         'bzrlib.tests.repository_implementations.test_revision',
         'bzrlib.tests.repository_implementations.test_statistics',
+        'bzrlib.tests.repository_implementations.test_write_group',
         ]
 
     from bzrlib.smart.server import (

=== modified file 'bzrlib/tests/repository_implementations/test_commit_builder.py'
--- a/bzrlib/tests/repository_implementations/test_commit_builder.py	2007-04-11 05:58:16 +0000
+++ b/bzrlib/tests/repository_implementations/test_commit_builder.py	2007-08-01 00:20:37 +0000
@@ -26,9 +26,13 @@
 class TestCommitBuilder(TestCaseWithRepository):
 
     def test_get_commit_builder(self):
-        tree = self.make_branch_and_tree(".")
-        builder = tree.branch.get_commit_builder([])
+        branch = self.make_branch('.')
+        branch.repository.lock_write()
+        builder = branch.repository.get_commit_builder(
+            branch, [], branch.get_config())
         self.assertIsInstance(builder, CommitBuilder)
+        branch.repository.commit_write_group()
+        branch.repository.unlock()
 
     def record_root(self, builder, tree):
         if builder.record_root_entry is True:
@@ -43,38 +47,51 @@
 
     def test_finish_inventory(self):
         tree = self.make_branch_and_tree(".")
-        builder = tree.branch.get_commit_builder([])
-        self.record_root(builder, tree)
-        builder.finish_inventory()
+        tree.lock_write()
+        try:
+            builder = tree.branch.get_commit_builder([])
+            self.record_root(builder, tree)
+            builder.finish_inventory()
+            tree.branch.repository.commit_write_group()
+        finally:
+            tree.unlock()
 
     def test_commit_message(self):
         tree = self.make_branch_and_tree(".")
-        builder = tree.branch.get_commit_builder([])
-        self.record_root(builder, tree)
-        builder.finish_inventory()
-        rev_id = builder.commit('foo bar blah')
+        tree.lock_write()
+        try:
+            builder = tree.branch.get_commit_builder([])
+            self.record_root(builder, tree)
+            builder.finish_inventory()
+            rev_id = builder.commit('foo bar blah')
+        finally:
+            tree.unlock()
         rev = tree.branch.repository.get_revision(rev_id)
         self.assertEqual('foo bar blah', rev.message)
 
     def test_commit_with_revision_id(self):
         tree = self.make_branch_and_tree(".")
-        # use a unicode revision id to test more corner cases.
-        # The repository layer is meant to handle this.
-        revision_id = u'\xc8abc'.encode('utf8')
+        tree.lock_write()
         try:
+            # use a unicode revision id to test more corner cases.
+            # The repository layer is meant to handle this.
+            revision_id = u'\xc8abc'.encode('utf8')
             try:
-                builder = tree.branch.get_commit_builder([],
-                    revision_id=revision_id)
-            except NonAsciiRevisionId:
-                revision_id = 'abc'
-                builder = tree.branch.get_commit_builder([],
-                    revision_id=revision_id)
-        except CannotSetRevisionId:
-            # This format doesn't support supplied revision ids
-            return
-        self.record_root(builder, tree)
-        builder.finish_inventory()
-        self.assertEqual(revision_id, builder.commit('foo bar'))
+                try:
+                    builder = tree.branch.get_commit_builder([],
+                        revision_id=revision_id)
+                except NonAsciiRevisionId:
+                    revision_id = 'abc'
+                    builder = tree.branch.get_commit_builder([],
+                        revision_id=revision_id)
+            except CannotSetRevisionId:
+                # This format doesn't support supplied revision ids
+                return
+            self.record_root(builder, tree)
+            builder.finish_inventory()
+            self.assertEqual(revision_id, builder.commit('foo bar'))
+        finally:
+            tree.unlock()
         self.assertTrue(tree.branch.repository.has_revision(revision_id))
         # the revision id must be set on the inventory when saving it. This
         # does not precisely test that - a repository that wants to can add it
@@ -86,12 +103,12 @@
     def test_commit_without_root(self):
         """This should cause a deprecation warning, not an assertion failure"""
         tree = self.make_branch_and_tree(".")
-        if tree.branch.repository.supports_rich_root():
-            raise tests.TestSkipped('Format requires root')
-        self.build_tree(['foo'])
-        tree.add('foo', 'foo-id')
         tree.lock_write()
         try:
+            if tree.branch.repository.supports_rich_root():
+                raise tests.TestSkipped('Format requires root')
+            self.build_tree(['foo'])
+            tree.add('foo', 'foo-id')
             entry = tree.inventory['foo-id']
             builder = tree.branch.get_commit_builder([])
             self.callDeprecated(['Root entry should be supplied to'
@@ -104,10 +121,14 @@
 
     def test_commit(self):
         tree = self.make_branch_and_tree(".")
-        builder = tree.branch.get_commit_builder([])
-        self.record_root(builder, tree)
-        builder.finish_inventory()
-        rev_id = builder.commit('foo bar')
+        tree.lock_write()
+        try:
+            builder = tree.branch.get_commit_builder([])
+            self.record_root(builder, tree)
+            builder.finish_inventory()
+            rev_id = builder.commit('foo bar')
+        finally:
+            tree.unlock()
         self.assertNotEqual(None, rev_id)
         self.assertTrue(tree.branch.repository.has_revision(rev_id))
         # the revision id must be set on the inventory when saving it. This does not
@@ -117,10 +138,14 @@
 
     def test_revision_tree(self):
         tree = self.make_branch_and_tree(".")
-        builder = tree.branch.get_commit_builder([])
-        self.record_root(builder, tree)
-        builder.finish_inventory()
-        rev_id = builder.commit('foo bar')
+        tree.lock_write()
+        try:
+            builder = tree.branch.get_commit_builder([])
+            self.record_root(builder, tree)
+            builder.finish_inventory()
+            rev_id = builder.commit('foo bar')
+        finally:
+            tree.unlock()
         rev_tree = builder.revision_tree()
         # Just a couple simple tests to ensure that it actually follows
         # the RevisionTree api.

=== modified file 'bzrlib/tests/repository_implementations/test_reconcile.py'
--- a/bzrlib/tests/repository_implementations/test_reconcile.py	2007-07-19 06:34:09 +0000
+++ b/bzrlib/tests/repository_implementations/test_reconcile.py	2007-08-07 22:59:45 +0000
@@ -56,15 +56,21 @@
         t = get_transport(self.get_url())
         # an empty inventory with no revision for testing with.
         repo = self.make_repository('inventory_without_revision')
+        repo.lock_write()
+        repo.start_write_group()
         inv = Inventory(revision_id='missing')
         inv.root.revision = 'missing'
         repo.add_inventory('missing', inv, [])
+        repo.commit_write_group()
+        repo.unlock()
 
         # an empty inventory with no revision for testing with.
         # this is referenced by 'references_missing' to let us test
         # that all the cached data is correctly converted into ghost links
         # and the referenced inventory still cleaned.
         repo = self.make_repository('inventory_without_revision_and_ghost')
+        repo.lock_write()
+        repo.start_write_group()
         repo.add_inventory('missing', inv, [])
         inv = Inventory(revision_id='references_missing')
         inv.root.revision = 'references_missing'
@@ -77,10 +83,14 @@
                        revision_id='references_missing')
         rev.parent_ids = ['missing']
         repo.add_revision('references_missing', rev)
+        repo.commit_write_group()
+        repo.unlock()
 
         # a inventory with no parents and the revision has parents..
         # i.e. a ghost.
         repo = self.make_repository('inventory_one_ghost')
+        repo.lock_write()
+        repo.start_write_group()
         inv = Inventory(revision_id='ghost')
         inv.root.revision = 'ghost'
         sha1 = repo.add_inventory('ghost', inv, [])
@@ -92,12 +102,16 @@
                        revision_id='ghost')
         rev.parent_ids = ['the_ghost']
         repo.add_revision('ghost', rev)
+        repo.commit_write_group()
+        repo.unlock()
          
         # a inventory with a ghost that can be corrected now.
         t.copy_tree('inventory_one_ghost', 'inventory_ghost_present')
         bzrdir_url = self.get_url('inventory_ghost_present')
         bzrdir = bzrlib.bzrdir.BzrDir.open(bzrdir_url)
         repo = bzrdir.open_repository()
+        repo.lock_write()
+        repo.start_write_group()
         inv = Inventory(revision_id='the_ghost')
         inv.root.revision = 'the_ghost'
         sha1 = repo.add_inventory('the_ghost', inv, [])
@@ -109,6 +123,8 @@
                        revision_id='the_ghost')
         rev.parent_ids = []
         repo.add_revision('the_ghost', rev)
+        repo.commit_write_group()
+        repo.unlock()
 
     def checkEmptyReconcile(self, **kwargs):
         """Check a reconcile on an empty repository."""
@@ -281,6 +297,8 @@
 
         # now setup the wrong-first parent case
         repo = tree.branch.repository
+        repo.lock_write()
+        repo.start_write_group()
         inv = Inventory(revision_id='wrong-first-parent')
         inv.root.revision = 'wrong-first-parent'
         sha1 = repo.add_inventory('wrong-first-parent', inv, ['2', '1'])
@@ -292,9 +310,13 @@
                        revision_id='wrong-first-parent')
         rev.parent_ids = ['1', '2']
         repo.add_revision('wrong-first-parent', rev)
+        repo.commit_write_group()
+        repo.unlock()
 
         # now setup the wrong-secondary parent case
         repo = repo_secondary
+        repo.lock_write()
+        repo.start_write_group()
         inv = Inventory(revision_id='wrong-secondary-parent')
         inv.root.revision = 'wrong-secondary-parent'
         sha1 = repo.add_inventory('wrong-secondary-parent', inv, ['1', '3', '2'])
@@ -306,6 +328,8 @@
                        revision_id='wrong-secondary-parent')
         rev.parent_ids = ['1', '2', '3']
         repo.add_revision('wrong-secondary-parent', rev)
+        repo.commit_write_group()
+        repo.unlock()
 
     def test_reconcile_wrong_order(self):
         # a wrong order in primary parents is optionally correctable

=== modified file 'bzrlib/tests/repository_implementations/test_repository.py'
--- a/bzrlib/tests/repository_implementations/test_repository.py	2007-08-07 07:47:27 +0000
+++ b/bzrlib/tests/repository_implementations/test_repository.py	2007-08-07 22:59:45 +0000
@@ -263,7 +263,11 @@
         wt = self.make_branch_and_tree('source')
         wt.commit('A', allow_pointless=True, rev_id='A')
         repo = wt.branch.repository
+        repo.lock_write()
+        repo.start_write_group()
         repo.sign_revision('A', bzrlib.gpg.LoopbackGPGStrategy(None))
+        repo.commit_write_group()
+        repo.unlock()
         old_signature = repo.get_signature_text('A')
         try:
             old_format = bzrdir.BzrDirFormat.get_default_format()
@@ -604,6 +608,8 @@
         # a inventory with no parents and the revision has parents..
         # i.e. a ghost.
         repo = self.make_repository('inventory_with_unnecessary_ghost')
+        repo.lock_write()
+        repo.start_write_group()
         inv = Inventory(revision_id = 'ghost')
         inv.root.revision = 'ghost'
         sha1 = repo.add_inventory('ghost', inv, [])
@@ -630,6 +636,8 @@
         # check its setup usefully
         inv_weave = repo.get_inventory_weave()
         self.assertEqual(['ghost'], inv_weave.get_ancestry(['ghost']))
+        repo.commit_write_group()
+        repo.unlock()
 
     def test_corrupt_revision_access_asserts_if_reported_wrong(self):
         repo_url = self.get_url('inventory_with_unnecessary_ghost')

=== modified file 'bzrlib/tests/workingtree_implementations/test_inv.py'
--- a/bzrlib/tests/workingtree_implementations/test_inv.py	2007-04-02 17:32:40 +0000
+++ b/bzrlib/tests/workingtree_implementations/test_inv.py	2007-07-17 16:04:00 +0000
@@ -80,6 +80,7 @@
             'fileid', 
             self.branch.repository.get_transaction()).get_lines('2')
         self.assertEqual(lines, ['contents of subdir/file\n'])
+        self.wt.branch.repository.commit_write_group()
 
     def test_snapshot_unchanged(self):
         #This tests that a simple commit does not make a new entry for
@@ -94,6 +95,7 @@
         self.assertRaises(errors.RevisionNotPresent,
                           vf.get_lines,
                           '2')
+        self.wt.branch.repository.commit_write_group()
 
     def test_snapshot_merge_identical_different_revid(self):
         # This tests that a commit with two identical parents, one of which has
@@ -114,6 +116,7 @@
                                   {'1':self.file_1, 'other':other_ie},
                                   self.wt, self.builder)
         self.assertEqual(self.file_active.revision, '2')
+        self.wt.branch.repository.commit_write_group()
 
     def test_snapshot_changed(self):
         # This tests that a commit with one different parent results in a new
@@ -124,6 +127,7 @@
                                   self.wt, self.builder)
         # expected outcome - file_1 has a revision id of '2'
         self.assertEqual(self.file_active.revision, '2')
+        self.wt.branch.repository.commit_write_group()
 
 
 class TestApplyInventoryDelta(TestCaseWithWorkingTree):

=== modified file 'doc/developers/repository.txt'
--- a/doc/developers/repository.txt	2007-07-15 07:31:37 +0000
+++ b/doc/developers/repository.txt	2007-08-01 00:20:37 +0000
@@ -299,6 +299,53 @@
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 
+Caching and writeing of data
+============================
+
+Repositories try to provide a consistent view of the data within them
+within a 'lock context'. 
+
+Locks
+-----
+
+Locks come in two flavours - read locks and write locks. Read locks allow
+data to be read from the repository. Write locks allow data to be read and
+signal that you intend to write data at some point. The actual writing of
+data must take place within a Write Group.
+
+Write locks provide a cache of repository data during the period of the
+write lock, and allow write_groups to be acquired. For some repositories
+the presence of a write lock is exclusive to a single client, for others
+which are lock free or use server side locks (e.g.  svn), the write lock
+simply provides the cache context. 
+
+Write Groups
+------------
+
+Write groups are the only allowed means for inserting data into a
+repository.  These are created by ``start_write_group``, and concluded by
+either ``commit_write_group`` or ``abort_write_group``.  A write lock must
+be held on the repository for the entire duration.  At most one write
+group can be active on a repository at a time.
+
+Write groups signal to the repository the window during which data is
+actively being inserted. Several write groups could be committed during a
+single lock.
+
+There is no guarantee that data inserted during a write group will be
+invisible in the repository if the write group is not committed.
+Specifically repositories without atomic insertion facilities will be
+writing data as it is inserted within the write group, and may not be able
+to revert that data - e.g. in the event of a dropped SFTP connection in a
+knit repository, inserted file data will be visible in the repository. Some
+repositories have an atomic insertion facility, and for those
+all-or-nothing will apply.
+
+The precise meaning of a write group is format specific. For instance a
+knit based repository treats the write group methods as dummy calls,
+simply meeting the api that clients will use. A pack based repository will
+open a new pack container at the start of a write group, and rename it
+into place at commit time.
 
 
 ..




More information about the bazaar-commits mailing list