[MERGE] Branch format 6

Mon Feb 12 15:16:36 GMT 2007

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin Pool wrote:
> On 11 Feb 2007, Aaron Bentley <aaron.bentley at utoronto.ca> wrote:
>>It is currently slower than format 5 for at least some operations.  It
>>could potentially outperform format 5, but more optimization is required
>>to meet that potential.
> 
> 
> Because of needing to generate the lefthand revision history to compute
> or interpret revision numbers?  It's probably a good tradeoff but we'll
> need to watch that we don't do it too often.

Well, I'm not sure.  I think that generating revision history ought to
be cheap, because we have to read the whole knit index anyhow.

If it helps, we can store both the revision-id and revno of the
last-revision.  That would mean that nearby revisions would be cheap to
locate by revno and far-away ones would more expensive (but cheaper than
they are now).

> This is https://bugs.launchpad.net/bzr/+bug/73752 by the way, and
> deserves a mention in NEWS, including the command line necessary to
> create such a branch.

I'm not sure what to do vis-a-vis the commandline.  It seems strange to
be able to set policy on creation, but not be able to change it later
using the commandline.  Yet that is already the situation with
repository trees.

> I'm not sure 'strict history' really conveys the meaning of this though.
> In that bug john suggests 'preserve_mainline' -- how is that for you?

Pretty vague.  I don't think of "mainline" as meaning the branch-local
view of history.  And in fact, my "Aaron's mergeable stuff" has this
kind of history, even though it's not a mainline branch.

In the context of Bazaar, "mainline" to me means "bzr.dev".  So I think
this is a potentially confusing term to use.

> Or maybe 'disallow_mainline_changes'.  I would like to convey the facts
> that 
> 
>   this option disallows operations that would otherwise be allowed
> 
>   and the operation is anything that changes the mainline revision 
>   history (is there a standard user-visible term for that now?)

So would like "disallow_history_mutation" would be closer to your
meaning?  I would prefer something like "preserve_history_order".

TBH, I don't think we do have good terminology for the branch's view of
lefthand history.

> Rather than proliferating these methods should we just treat them as a
> dictionary of options or settings on the branch?  But then the
> location-related ones do have some individual behaviour...

I think that this is not an area where we want to encourage great
amounts of variation, whereas treating them as properties would do that.
 Having these as public methods make sure that format authors decide how
to supply them, even if they use a dummy implementation.

There's probably room for more code reuse here.  But I would be inclined
to supply all these, especially because failing to do that with most of
them would break API compatability.

>>@@ -554,13 +565,7 @@
>>         result.set_parent(self.bzrdir.root_transport.base)
>>         return result
>> 
>>-    @needs_read_lock
>>-    def copy_content_into(self, destination, revision_id=None):
>>-        """Copy the content of self into destination.
>>-
>>-        revision_id: if not None, the revision history in the new branch will
>>-                     be truncated to end with revision_id.
>>-        """
>>+    def _synchronize_history(self, destination, revision_id):
> 
> 
> Could do with a docstring even though it's private.

Okay.

>>+class BzrBranchFormat6(BzrBranchFormat5):
>>+    """Branch format with last-revision
>>+
>>+    Unlike previous formats, this has no explicit revision history, instead
>>+    the left-parent revision history is used.
>>+    """
> 
> 
>   "instead this just stores the last revision, and the left-hand history
>   leading up to there is the history"
> 
>   "This was introduced in bzr 0.15"

Sure.

>>+    def initialize(self, a_bzrdir):
>>+        """Create a branch of this format in a_bzrdir."""
>>+        return self.open(a_bzrdir, _found=True, )
> 
> 
> You don't have to do it but this really wants to be factored out...

Okay, I tend to agree.

>>+class BzrBranch6(BzrBranch5):
>>+
> 
> 
> I guess subclassing it is the pragmatic way to get this going, though
> it's not strictly correct that this is a kind of BzrBranch5...

To me, this is just programming-by-difference.  A Branch6 is a Branch5,
except that it handles certain operations differently.

Because Branch has a fat interface, there's no practical way to address
the commonalities between formats 5 and 6 except by inheritance.

It would be possible to extract the common functionality into a base
class, but:

- - That would mean three classes involved:
  1. BzrBranch
  2. BzrBranch56Base (new)
  3. BzrBranch6
  This seems like too much.

- - New formats would require different bases, causing base class
proliferation.  How would we manage a BzrBranch7 that was like a
BzrBranch6 except for how it retrieved config data?

- - This way makes it easy to see exactly how 6 differs from 5.

>>+    def _check_history_violation(self, revision_id):
> 
> 
> Better name?  _check_not_in_history()?

To me, that sounds like we want the revision_id to not be in history.

> 
> 
>>+        last_revision = self.last_revision()
>>+        if last_revision is None:
>>+            return
>>+        if last_revision not in self._lefthand_history(revision_id):
>>+            raise errors.StrictHistoryViolation
> 
> 
> Should it have any parameters?

I guess we could include the branch location.  Mentioning revision ids
probably isn't a win.

>>+    def append_revision(self, *revision_ids):
>>+        if len(revision_ids) == 0:
>>+            return
>>+        prev_revision = self.last_revision()
>>+        for revision in self.repository.get_revisions(revision_ids):
>>+            if prev_revision is None:
>>+                assert revision.parent_ids == []
>>+            else:
>>+                assert revision.parent_ids[0] == prev_revision
>>+            prev_revision = revision.revision_id
>>+        self.set_last_revision(revision_ids[-1])
> 
> 
> Given the recent confusion about AssertionErrors raised from
> kind errors it would be better to raise at least a BzrError with a
> proper message in cases where this is asserting.

I don't agree.  If those assertions fail, there is a bug in the calling
code.  (It is trying to append a revision that is not a proper child of
the last-revision).

The kind errors issue was problematic because user actions could cause
an assert to fail.

> 
> 
>   'No location suppled and no previous location known.'

Sure.

>>+class StrictHistoryViolation(BzrError):
>>+
>>+    _fmt = """Applying operation would alter current history"""
>>+
>>+
> 
> 
> We could have a more user-oriented message:
> 
>   Operation denied because it would change the main history, which is
>   not permitted by the strict_history setting on branch %s. 
> 
> so it tells them why this happened...
> 
> (or whatever option name is chosen)

Sure.

>>     def test_set_push_location(self):
>>-        from bzrlib.config import (locations_config_filename,
>>-                                   ensure_config_dir_exists)
>>-        ensure_config_dir_exists()
>>-        fn = locations_config_filename()
>>         branch = self.get_branch()
>>         branch.set_push_location('foo')
>>-        local_path = urlutils.local_path_from_url(branch.base[:-1])
>>-        self.assertFileEqual("[%s]\n"
>>-                             "push_location = foo\n"
>>-                             "push_location:policy = norecurse" % local_path,
>>-                             fn)
>>-
>>-    # TODO RBC 20051029 test getting a push location from a branch in a 
>>-    # recursive section - that is, it appends the branch name.
>>+        self.assertEqual('foo', branch.get_push_location())
> 
> 
> Can you tell me why?  Was this just cleaned up?

Testing the contents of the config file isn't an appropriate way to
ensure that setting the push_location works.

For Branch6, the push location is stored in the branch's branch.conf,
not the user's locations.conf.

So I changed the test to use the branch interface instead.

>>=== modified file 'bzrlib/tests/branch_implementations/test_hooks.py'
>>--- bzrlib/tests/branch_implementations/test_hooks.py	2007-02-10 07:09:41 +0000
>>+++ bzrlib/tests/branch_implementations/test_hooks.py	2007-02-11 20:23:49 +0000
>>@@ -42,7 +42,12 @@
>>             [('set_rh', branch, [], True)])
>> 
>>     def test_set_rh_nonempty_history(self):
>>-        branch = self.make_branch('source')
>>+        tree = self.make_branch_and_memory_tree('source')
>>+        tree.lock_write()
>>+        tree.add('')
> 
> 
> What does that do?

MemoryTrees must be locked in order to work properly.  They must have
the root directory explicitly added for commit to succeed.

Why am I committing?  Because Branch6.set_revision_history requires the
revisions to exist, so that it can guarantee that if
set_revision_history succeeds, get_revision_history will return the
requested output.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFF0ITT0F+nu1YWqI0RAl1VAJ9uxTlKz6r/vUhOSY+4Ssenh7hxcACfQhgw
zSnqToOaiJV8njJk6l48qJE=
=U0Zw
-----END PGP SIGNATURE-----