[MERGE] Branch format 6

Martin Pool mbp at canonical.com
Mon Feb 12 08:18:43 GMT 2007


On 11 Feb 2007, Aaron Bentley <aaron.bentley at utoronto.ca> wrote:
> Hi all,
> 
> This patch introduces format 6

+1 with some comments and questions.

> 
> Format
> ######
> 
> Format 6 differs from the current format 5 in two ways:
> 
> 1. It stores only the last revision, not the entire revision history
> 2. It stores all policy data in the branch config file
>   - parent location
>   - bound location
>   - push location
> 
> This format is compatible with knit (and possibly metaweave) format
> repositories, but it is not compatible with old clients.
> 
> Performance
> ###########
> 
> Because they don't store revision history, format 6 branches are much
> smaller than format 5 branches.  bzr.dev's branch shrinks from 144k to
> 20K on ext3 (space used, not apparent size).

Well done.

> It is currently slower than format 5 for at least some operations.  It
> could potentially outperform format 5, but more optimization is required
> to meet that potential.

Because of needing to generate the lefthand revision history to compute
or interpret revision numbers?  It's probably a good tradeoff but we'll
need to watch that we don't do it too often.

> Strict history
> ==============
> When a branch has a strict history policy, you can only append to it;
> you cannot change the previous history.  This keeps revnos stable and
> maintains a consistent view of history.  Strict history can be enabled
> by setting "strict_history=True" in .bzr/branch/branch.conf

This is https://bugs.launchpad.net/bzr/+bug/73752 by the way, and
deserves a mention in NEWS, including the command line necessary to
create such a branch.  I'm happy to see it come in, I think it will be
helpful in cases like that bug.

I'm not sure 'strict history' really conveys the meaning of this though.
In that bug john suggests 'preserve_mainline' -- how is that for you?
Or maybe 'disallow_mainline_changes'.  I would like to convey the facts
that 

  this option disallows operations that would otherwise be allowed

  and the operation is anything that changes the mainline revision 
  history (is there a standard user-visible term for that now?)

> Rebind
> ======
> When a branch is unbound, the "bound" config value is set to "False",
> instead of clearing the "bound_location" value.  If "bind" is issued
> with no arguments, the previous bound location is used.
> 
> This is consistent with merge, pull, push, etc, which all remember
> previously-used locations.
> 
> It means that if you have a heavyweight checkout but no access to the
> master branch, you can "unbind", do a series of commits (without
> --local) and finally bind when the branch becomes available again.

+1

> 
> Defaults
> ########
> I propose retaining "knit" as the default repository format, because of
> performance issues, and to avoid watersheds.
> 
> I do believe that the disk format is unlikely to change, and getting it
> merged now will mean it will be broadly supported when it becomes the
> default.
> 

OK

> Aaron

> === modified file 'bzrlib/branch.py'
> --- bzrlib/branch.py	2007-02-10 07:09:40 +0000
> +++ bzrlib/branch.py	2007-02-11 20:47:05 +0000
> @@ -225,7 +225,9 @@
>          try:
>              if last_revision is None:
>                  pb.update('get source history')
> -                last_revision = from_branch.last_revision_info()[1]
> +                last_revision = from_branch.last_revision()
> +                if last_revision is None:
> +                    last_revision = _mod_revision.NULL_REVISION
>              return self.repository.fetch(from_branch.repository,
>                                           revision_id=last_revision,
>                                           pb=nested_pb)
> @@ -242,6 +244,11 @@
>          """
>          return None
>      
> +    def get_old_bound_location(self):
> +        """Return the URL of the branch we used to be bound to
> +        """
> +        raise errors.UpgradeRequired(self.base)
> +
>      def get_commit_builder(self, parents, config=None, timestamp=None, 
>                             timezone=None, committer=None, revprops=None, 
>                             revision_id=None):
> @@ -311,6 +318,10 @@
>          """Older format branches cannot bind or unbind."""
>          raise errors.UpgradeRequired(self.base)
>  
> +    def set_strict_history(self, strict):
> +        """Older format branches cannot have a strict history policy."""
> +        raise errors.UpgradeRequired(self.base)
> +
>      def last_revision(self):
>          """Return last revision id, or None"""
>          ph = self.revision_history()

Rather than proliferating these methods should we just treat them as a
dictionary of options or settings on the branch?  But then the
location-related ones do have some individual behaviour...

> @@ -554,13 +565,7 @@
>          result.set_parent(self.bzrdir.root_transport.base)
>          return result
>  
> -    @needs_read_lock
> -    def copy_content_into(self, destination, revision_id=None):
> -        """Copy the content of self into destination.
> -
> -        revision_id: if not None, the revision history in the new branch will
> -                     be truncated to end with revision_id.
> -        """
> +    def _synchronize_history(self, destination, revision_id):

Could do with a docstring even though it's private.

>          new_history = self.revision_history()
>          if revision_id is not None:
>              try:
> @@ -569,6 +574,15 @@
>                  rev = self.repository.get_revision(revision_id)
>                  new_history = rev.get_history(self.repository)[1:]
>          destination.set_revision_history(new_history)
> +
> +    @needs_read_lock
> +    def copy_content_into(self, destination, revision_id=None):
> +        """Copy the content of self into destination.
> +
> +        revision_id: if not None, the revision history in the new branch will
> +                     be truncated to end with revision_id.
> +        """
> +        self._synchronize_history(destination, revision_id)
>          try:
>              parent = self.get_parent()
>          except errors.InaccessibleParent, e:
> @@ -617,6 +631,7 @@
>              format.repository_format = repository.RepositoryFormat7()
>          else:
>              format = self.repository.bzrdir.cloning_metadir()
> +            format.branch_format = self._format
>          return format
>  
>      def create_checkout(self, to_location, revision_id=None,
> @@ -920,6 +935,59 @@
>          return "Bazaar-NG Metadir branch format 5"
>  
>  
> +class BzrBranchFormat6(BzrBranchFormat5):
> +    """Branch format with last-revision
> +
> +    Unlike previous formats, this has no explicit revision history, instead
> +    the left-parent revision history is used.
> +    """

  "instead this just stores the last revision, and the left-hand history
  leading up to there is the history"

  "This was introduced in bzr 0.15"

> +
> +    def get_format_string(self):
> +        """See BranchFormat.get_format_string()."""
> +        return "Bazaar-NG branch format 6\n"
> +
> +    def get_format_description(self):
> +        """See BranchFormat.get_format_description()."""
> +        return "Branch format 6"
> +
> +    def initialize(self, a_bzrdir):
> +        """Create a branch of this format in a_bzrdir."""
> +        mutter('creating branch %r in %s', self, a_bzrdir.transport.base)
> +        branch_transport = a_bzrdir.get_branch_transport(self)
> +        utf8_files = [('last-revision', 'null:\n'),
> +                      ('branch-name', ''),
> +                      ('branch.conf', '')
> +                      ]
> +        control_files = lockable_files.LockableFiles(branch_transport, 'lock',
> +                                                     lockdir.LockDir)
> +        control_files.create_lock()
> +        control_files.lock_write()
> +        control_files.put_utf8('format', self.get_format_string())
> +        try:
> +            for file, content in utf8_files:
> +                control_files.put_utf8(file, content)
> +        finally:
> +            control_files.unlock()
> +        return self.open(a_bzrdir, _found=True, )

You don't have to do it but this really wants to be factored out...

> @@ -1395,6 +1476,13 @@
>                          "use bzrlib.urlutils.escape")
>                      
>              url = urlutils.relative_url(self.base, url)
> +        self._set_parent_location(url)
> +
> +    def _set_parent_location(self, url):
> +        if url is None:
> +            self.control_files._transport.delete('parent')
> +        else:
> +            assert isinstance(url, str)
>              self.control_files.put('parent', StringIO(url + '\n'))
>  
>      @deprecated_function(zero_nine)
> @@ -1574,6 +1662,134 @@
>          return None
>  
>  
> +class BzrBranch6(BzrBranch5):
> +

I guess subclassing it is the pragmatic way to get this going, though
it's not strictly correct that this is a kind of BzrBranch5...

> +    @needs_read_lock
> +    def last_revision(self):
> +        """Return last revision id, or None"""
> +        revision_id = self.control_files.get_utf8('last-revision').read()
> +        revision_id = revision_id.rstrip('\n')
> +        if revision_id == _mod_revision.NULL_REVISION:
> +            revision_id = None
> +        return revision_id
> +
> +    @needs_write_lock
> +    def set_last_revision(self, revision_id):
> +        if self._get_strict_history():
> +            self._check_history_violation(revision_id)
> +        if revision_id is None:
> +            revision_id = 'null:'
> +        self.control_files.put_utf8('last-revision', revision_id + '\n')
> +
> +    def _check_history_violation(self, revision_id):

Better name?  _check_not_in_history()?

> +        last_revision = self.last_revision()
> +        if last_revision is None:
> +            return
> +        if last_revision not in self._lefthand_history(revision_id):
> +            raise errors.StrictHistoryViolation

Should it have any parameters?

> +
> +    @needs_read_lock
> +    def revision_history(self):
> +        """Generate the revision history from last revision
> +        """
> +        return self._lefthand_history(self.last_revision())
> +
> +    @needs_write_lock
> +    def set_revision_history(self, history):
> +        """Set the last_revision, not revision history"""
> +        if len(history) == 0:
> +            self.set_last_revision('null:')
> +        else:
> +            assert history == self._lefthand_history(history[-1])
> +            self.set_last_revision(history[-1])
> +        for hook in Branch.hooks['set_rh']:
> +            hook(self, history)
> +
> +    @needs_write_lock
> +    def append_revision(self, *revision_ids):
> +        if len(revision_ids) == 0:
> +            return
> +        prev_revision = self.last_revision()
> +        for revision in self.repository.get_revisions(revision_ids):
> +            if prev_revision is None:
> +                assert revision.parent_ids == []
> +            else:
> +                assert revision.parent_ids[0] == prev_revision
> +            prev_revision = revision.revision_id
> +        self.set_last_revision(revision_ids[-1])

Given the recent confusion about AssertionErrors raised from
kind errors it would be better to raise at least a BzrError with a
proper message in cases where this is asserting.

> === modified file 'bzrlib/builtins.py'
> --- bzrlib/builtins.py	2007-02-10 07:09:41 +0000
> +++ bzrlib/builtins.py	2007-02-11 20:22:39 +0000
> @@ -2914,11 +2914,21 @@
>      See "help checkouts" for more information on checkouts.
>      """
>  
> -    takes_args = ['location']
> +    takes_args = ['location?']
>      takes_options = []
>  
>      def run(self, location=None):
>          b, relpath = Branch.open_containing(u'.')
> +        if location is None:
> +            try:
> +                location = b.get_old_bound_location()
> +            except errors.UpgradeRequired:
> +                raise errors.BzrCommandError('No location supplied.  '
> +                    'This format does not remember old locations.')
> +            else:
> +                if location is None:
> +                    raise errors.BzrCommandError('No location supplied.  '
> +                        'No previous location known')

  'No location suppled and no previous location known.'

> === modified file 'bzrlib/errors.py'
> --- bzrlib/errors.py	2007-02-04 15:03:01 +0000
> +++ bzrlib/errors.py	2007-02-11 20:17:51 +0000
> @@ -803,6 +803,11 @@
>      _fmt = "%(branch)s is missing %(object_type)s {%(object_id)s}"
>  
>  
> +class StrictHistoryViolation(BzrError):
> +
> +    _fmt = """Applying operation would alter current history"""
> +
> +

We could have a more user-oriented message:

  Operation denied because it would change the main history, which is
  not permitted by the strict_history setting on branch %s. 

so it tells them why this happened...

(or whatever option name is chosen)

>  class DivergedBranches(BzrError):
>      
>      _fmt = "These branches have diverged.  Use the merge command to reconcile them."""
> 

> @@ -577,20 +596,9 @@
>          self.assertEqual("foo", self.get_branch().get_push_location())
>  
>      def test_set_push_location(self):
> -        from bzrlib.config import (locations_config_filename,
> -                                   ensure_config_dir_exists)
> -        ensure_config_dir_exists()
> -        fn = locations_config_filename()
>          branch = self.get_branch()
>          branch.set_push_location('foo')
> -        local_path = urlutils.local_path_from_url(branch.base[:-1])
> -        self.assertFileEqual("[%s]\n"
> -                             "push_location = foo\n"
> -                             "push_location:policy = norecurse" % local_path,
> -                             fn)
> -
> -    # TODO RBC 20051029 test getting a push location from a branch in a 
> -    # recursive section - that is, it appends the branch name.
> +        self.assertEqual('foo', branch.get_push_location())

Can you tell me why?  Was this just cleaned up?

> === modified file 'bzrlib/tests/branch_implementations/test_hooks.py'
> --- bzrlib/tests/branch_implementations/test_hooks.py	2007-02-10 07:09:41 +0000
> +++ bzrlib/tests/branch_implementations/test_hooks.py	2007-02-11 20:23:49 +0000
> @@ -42,7 +42,12 @@
>              [('set_rh', branch, [], True)])
>  
>      def test_set_rh_nonempty_history(self):
> -        branch = self.make_branch('source')
> +        tree = self.make_branch_and_memory_tree('source')
> +        tree.lock_write()
> +        tree.add('')

What does that do?

> +        tree.commit('empty commit', rev_id='foo')
> +        tree.unlock()
> +        branch = tree.branch
>          Branch.hooks.install_hook('set_rh', self.capture_set_rh_hook)
>          branch.set_revision_history([u'foo'])
>          self.assertEqual(self.hook_calls,

-- 
Martin



More information about the bazaar mailing list