[RFC][MERGE] bzrdir phase 3

Thu Feb 16 20:45:50 GMT 2006

On Thu, 2006-02-16 at 09:39 -0600, John A Meinel wrote:

> > +
> > +
> > +class TestIsControlFilename(TestCaseWithWorkingTree):
> > +
> > +    def validate_tree_is_controlfilename(self, tree):
> > +        """check that 'tree' obeys the contract for is_control_filename."""
> > +        bzrdirname = basename(tree.bzrdir.transport.base[:-1])
> > +        self.assertTrue(tree.is_control_filename(bzrdirname))
> > +        self.assertTrue(tree.is_control_filename(bzrdirname + '/subdir'))
> > +        self.assertFalse(tree.is_control_filename('dir/' + bzrdirname))
> > +        self.assertFalse(tree.is_control_filename('dir/' + bzrdirname + '/sub'))
> 
> What is the contract for 'is_control_filename', because most likely
> 'dir/.bzr/sub' would be a control file for a different branch. I don't
> think we want an outer branch to be controlling an inner branch's meta
> files.

To add 'dir/.bzr/sub' to this tree, first we have to add 'dir'. If 'dir'
is a tree, then that triggers the 'cant add a subtree' code path.

The contract is that it returns true IF and only IF the file is part of
the control meta data bzr maintains.

> > +
> > +    def test_dotbzr_is_control_in_cwd(self):
> > +        tree = self.make_branch_and_tree('.')
> > +        self.validate_tree_is_controlfilename(tree)
> > +        
> > +    def test_dotbzr_is_control_in_subdir(self):
> > +        tree = self.make_branch_and_tree('subdir')
> > +        self.validate_tree_is_controlfilename(tree)
> > +        
> 
> Why are you checking 'subdir' if you don't have an outer tree?

if `cwd` != 'tree_root', its possible to get different results. I found
a bug by testing that :).

> > 
> > === modified file 'bzrlib/__init__.py'
> > --- bzrlib/__init__.py	
> > +++ bzrlib/__init__.py	
> > @@ -16,7 +16,6 @@
> >  
> >  """bzr library"""
> >  
> > -BZRDIR = ".bzr"
> 
> Is this in preparation for allowing '_bzr'?

Or any other such change yes.

> >  
> >  # please keep these sorted (in C locale order) to aid merging
> >  DEFAULT_IGNORE = [
> 
> Side note:
> When we get the new ignore stuff, I think we should move DEFAULT_IGNORES
> out from bzrlib.DEFAULT_IGNORE.

Agreed.

> > 
> > === modified file 'bzrlib/add.py'
> good
> 
> > === modified file 'bzrlib/branch.py'
> ...
> > -    def _find_modes(self, t):
> > -        """Determine the appropriate modes for files and directories.
> 
> This should be on LockableFiles anyway.
> 
> ...
> > @@ -597,10 +574,8 @@
> >          if not _found:
> >              # we are being called directly and must probe.
> >              raise NotImplementedError
> > -        transport = a_bzrdir.get_branch_transport(self)
> > -        control_files = LockableFiles(transport, 'branch-lock')
> >          return BzrBranch(_format=self,
> > -                         _control_files=control_files,
> > +                         _control_files=a_bzrdir._control_files,
> >                           a_bzrdir=a_bzrdir,
> >                           _repository=a_bzrdir.open_repository())
> 
> Is this the right place to do it? If this section is only for format
> 4,5,6 branches, I'm okay with it. I certainly don't want to deny new
> formats from having separate control files.

That is the open routine for format 4 branches (which are used in format
4,5,6 bzrdirs). The format 5 opener uses a separate control file
('branch/lock').

> >  class cmd_diff(Command):
> > @@ -1451,11 +1450,29 @@
> >      this command. When the default format has changed you may also be warned
> >      during other operations to upgrade.
> >      """
> > +    # NB: this is used from the class without creating an instance, which is
> > +    # why it does not have a self parameter.
> > +    def get_format_type(typestring):
> > +        """Parse and return a format specifier."""
> > +        if typestring == "metadir":
> > +            return bzrdir.BzrDirMetaFormat1
> > +        msg = "No known bzr-dir format %s. Supported types are: metadir\n" %\
> > +            (typestring)
> > +        raise BzrCommandError(msg)
> > +
> 
> Doesn't this need to be declared 'staticmethod'?

No. The comment there should explain it. Its a quirk of the way the
command classes are used; I haven't spent the time digging into it yet
to explain it.

> > === modified file 'bzrlib/bzrdir.py'
> > --- bzrlib/bzrdir.py	
> > +++ bzrlib/bzrdir.py	
> > @@ -21,18 +21,32 @@
> >  """
> >  
> >  from copy import deepcopy
> > +import os
> >  from cStringIO import StringIO
> >  from unittest import TestSuite
> > -
> >  
> >  import bzrlib
> >  import bzrlib.errors as errors
> >  from bzrlib.lockable_files import LockableFiles
> >  from bzrlib.osutils import safe_unicode
> > +from bzrlib.osutils import (
> > +                            abspath,
> > +                            pathjoin,
> > +                            safe_unicode,
> > +                            sha_strings,
> > +                            sha_string,
> > +                            )
> 
> Why do you have the closing ')' line up with the entries, rather than
> lining up with the opening '('. This isn't the only place where occurs
> (as I recall the test name listing does this as well). I think it would
> look better with the () lined up.

In Launchpad the coding style is that such things line up with the
content like I do here, I don't recall if thats PEP8 or just a visual
aid. Regardless of that we can choose our destiny... but I like it
lining up with the indented entries, its less distruptive to the eye
while still being trivial to add lines to.

> > +from bzrlib.store.text import TextStore
> > +from bzrlib.store.weave import WeaveStore
> > +from bzrlib.symbol_versioning import *
> >  from bzrlib.trace import mutter
> > -from bzrlib.symbol_versioning import *
> > +from bzrlib.transactions import PassThroughTransaction
> >  from bzrlib.transport import get_transport
> >  from bzrlib.transport.local import LocalTransport
> > +from bzrlib.weave import Weave
> > +from bzrlib.weavefile import read_weave, write_weave
> > +from bzrlib.xml4 import serializer_v4
> > +from bzrlib.xml5 import serializer_v5
> >  
> >  
> >  class BzrDir(object):
> > @@ -46,6 +60,10 @@
> >      root_transport
> >          a transport connected to the directory this bzr was opened from.
> >      """
> > +
> > +    def can_update_format(self):
> > +        """Return true if this bzrdir is one whose format we can update."""
> > +        return True
> >  
> >      def _check_supported(self, format, allow_unsupported):
> >          """Check whether format is a supported format.
> > @@ -343,6 +361,22 @@
> >          self._format = _format
> >          self.transport = _transport.clone('.bzr')
> >          self.root_transport = _transport
> > +
> > +    def needs_format_update(self, format=None):
> > +        """Return true if this bzrdir needs update_format run on it.
> > +        
> > +        For instance, if the repository format is out of date but the 
> > +        branch and working tree are not, this should return True.
> > +
> > +        :param format: Optional parameter indicating a specific desired
> > +                       format we want to end up at.
> > +        """
> > +        # for now, if the format is not the same as the system default,
> > +        # an upgrade is needed. In the future we will want to scan
> > +        # the individual repository/branch/checkout formats too
> > +        if format is None:
> > +            format = BzrDirFormat.get_default_format().__class__
> > +        return not isinstance(self._format, format)
> 
> The problem is 'BzrFormat6.needs_format_upgrade(BzrFormat4) would return
> True, even though it is 'downgrading' the format.

Thats correct though: if someone wants to change the format to be
BzrFormat4, the will need to run a format converter.

> >  
> >      @staticmethod
> >      def open_unsupported(base):
> > @@ -499,6 +533,12 @@
> >  class BzrDirPreSplitOut(BzrDir):
> >      """A common class for the all-in-one formats."""
> >  
> > +    def __init__(self, _transport, _format):
> > +        """See BzrDir.__init__."""
> > +        super(BzrDirPreSplitOut, self).__init__(_transport, _format)
> > +        self._control_files = LockableFiles(self.get_branch_transport(None),
> > +                                            'branch-lock')
> > +
> >      def clone(self, url, revision_id=None, basis=None, force_new_repo=False):
> >          """See BzrDir.clone()."""
> >          from bzrlib.workingtree import WorkingTreeFormat2
> > @@ -608,6 +648,10 @@
> >          from bzrlib.repository import RepositoryFormat4
> >          return RepositoryFormat4().initialize(self, shared)
> >  
> > +    def needs_format_update(self, format=None):
> > +        """Format 4 dirs are always in need of updating."""
> > +        return True
> > +
> >      def open_repository(self):
> >          """See BzrDir.open_repository."""
> >          from bzrlib.repository import RepositoryFormat4
> > @@ -655,6 +699,10 @@
> >      individual formats are really split out.
> >      """
> >  
> > +    def can_update_format(self):
> > +        """See BzrDir.can_update_format()."""
> > +        return False
> > +
> >      def create_branch(self):
> >          """See BzrDir.create_branch."""
> >          from bzrlib.branch import BranchFormat
> > @@ -711,6 +759,11 @@
> >          except errors.FileExists:
> >              pass
> >          return self.transport.clone('checkout')
> > +
> > +    def needs_format_update(self, format=None):
> > +        """See BzrDir.needs_format_update()."""
> > +        # currently there are no possible updates to meta1 formats.
> > +        return False
> >  
> >      def open_branch(self, unsupported=False):
> >          """See BzrDir.open_branch."""
> > @@ -776,6 +829,20 @@
> >      def get_format_string(self):
> >          """Return the ASCII format string that identifies this format."""
> >          raise NotImplementedError(self.get_format_string)
> > +
> > +    def get_updater(self, format=None):
> > +        """Return the updater to use to convert bzrdirs needing updates.
> > +
> > +        This returns a bzrlib.bzrdir.Converter object.
> > +
> > +        This should return the best upgrader to step this format towards the
> > +        current default format. In the case of plugins we can/shouold provide
> > +        some means for them to extend the range of returnable converters.
> 
> Small typo. And also why I thought the target format should be the one
> who knows how to upgrade.

Its not quite ready for plugins - I didn't want to write the
infrastructure code with no examples, but as I'm not working on
versioned file [well, after this patch gets beaten on], I should have
the use cases to drive it to completion.

Hmm, wiki is offline - the spec is there. So from memory, we essentially
have pairs of changes that need to occur: (FROMFORMAT, TOFORMAT). And if
we gather those all up we can build a graph. Note that its a full graph,
not a DAG, because we have a generic converter 'read everything, write
everything' which allows downgrades, and we should also be able to write
specific reverters if needed. This is only relevant for bleeding edge
stuff of course - but its essential to be able to get back to the
'stable' format if you upgraded too early, or to play around.

So a plugin while it knows enough to convert *from* any given format,
also needs to be able to tell bzr how to convert *to* the mainline
format again. So being owned by either the target or the source is
wrong. So at the moment, I've given it to the source but passed in the
eventual target as well. That allows me to convert the internals from
hard coded results into a conversion graph lookup - and when that is
done plugins will do:

bzrdir.BzrDir.Converter.register_converter(from_format, to_format,
converter_factory)

i.e. for the current bound branch format you might offer a converter
from them to the metadir, using a different branch output to preserve
the bound branch data
Converter.register_converter(BoundBzrDirFormat, BzrMetaFormat1,
ConvertBoundBranchToMetaBoundBranch)

class ConvertBoundBranchToMetaBoundBranch(ConvertFormat6ToMeta):
...

that would output a meta 1 bzr dir, format 7 repo, format 3 working tree
and BoundBranchFormat format branch. 

But importantly: the source format is provided by your plugin, the
target bzrdir format is -not-.

And finally, the reason the api is on the source format is that that is
the order we walk the conceptual graph: SOURCE->OUTPUT, rinse and repeat
until we are done.

> > +
> > +
> > +class Converter(object):
> > +    """Converts a disk format object from one format to another."""
> > +
> > +    def convert(self, to_convert, pb):
> > +        """Perform the conversion of to_convert, giving feedback via pb.
> > +
> > +        :param to_convert: The disk object to convert.
> > +        :param pb: a progress bar to use for progress information.
> > +        """
> > +
> 
> We probably want some sort of NotImplemented() exception here, rather
> than nothing.

Sure thing.

> Ultimately, the lock format is going to change, right? Is this happening
> before we start using Meta-1 format?
> 
> Though I guess that would be a new Branch format as well (since it needs
> to know about the new locking style).

We can start using the Meta-1 format when we decide its time to start
guarantee support for branches in it - we just freeze the format and
bump to a new one internally. I think letting the lock code get in is
good, but I'd like an intermediary format between the weave and the knit
transition. So, we could start using this now I think.

> > @@ -80,7 +106,6 @@
> >  
> >      print
> >      print 'branch history:'
> > -    history = b.revision_history()
> >      revno = len(history)
> >      print '  %8d revision%s' % (revno, plural(revno))
> >      committers = {}
> > @@ -107,7 +132,7 @@
> >      print
> >      print 'revision store:'
> >      c, t = b.repository.revision_store.total_size()
> > -    print '  %8d revisions' % c
> > +    print '  %8d revision%s' % (c, plural(c))
> >      print '  %8d kB' % (t/1024)
> 
> Isn't there supposed to be a special way of doing pluralization for
> internationalization work? (Rather than just hard-coding 's' for
> English). Speaking of which, is there some time when we are planning on
> going through the code and adding _() around strings so that we can do
> i18n work?
> Perhaps that is an 0.999 version. But it probably should be done. (And
> is going to lead to a lot of conflicts).

Agreed, but not for this pastch :). gettext has plural forms yes, and
fortunately I have a handy supply of folk that know all about that just
over in the rosetta developers :)

> >  
> >  
> > 
> > === modified file 'bzrlib/missing.py'
> > --- bzrlib/missing.py	
> > +++ bzrlib/missing.py	
> > @@ -1,7 +1,7 @@
> >  """\
> >  A plugin for displaying what revisions are in 'other' but not in local.
> >  """
> 
> Obviously the comment is incorrect now. But not really your fault.

I'll touch it up.

> > -from bzrlib.ui import ui_factory
> > +import bzrlib.ui as ui
> 
> Definitely, otherwise setting ui_factory will have no effect, since we
> have a locally bound variable.

Yup. Want to guess how I figured to change that :)

> >  
> > -class TestUpgrade(TestCaseWithTransport):
> > +class TestUIFactory(ui.UIFactory):
> > +    """A UI Factory which never captures its output.
> > +    """
> > +
> > +    def note(self, fmt_string, *args, **kwargs):
> > +        """See progress.ProgressBar.note()."""
> > +        print fmt_string % args
> > +
> > +    def progress_bar(self):
> > +        return self
> > +        
> > +    def update(self, message, count, total):
> > +        """See progress.ProgressBar.update()."""
> > +
> > +
> 
> This looks more like debugging code, rather than something we want to
> stay. Is that true?

Nope, its prototype test harness support for tracking the use of
progress bars. Once it stabilises a bit more I'll move it to
bzrlib.tests.

> ...
> 
> > === modified file 'bzrlib/tests/test_ui.py'
> > --- bzrlib/tests/test_ui.py	
> > +++ bzrlib/tests/test_ui.py	
> > @@ -18,8 +18,10 @@
> >  """
> >  
> >  import os
> > +from StringIO import StringIO
> >  import sys
> >  
> > +from bzrlib.progress import TTYProgressBar
> >  from bzrlib.tests import TestCase
> >  from bzrlib.ui import SilentUIFactory
> >  from bzrlib.ui.text import TextUIFactory
> > @@ -50,3 +52,13 @@
> >          #                                   user=u'some\u1234')
> >          #                  , 'bogus')
> >  
> > +
> > +    def test_progress_note(self):
> > +        stderr = StringIO()
> > +        stdout = StringIO()
> > +        ui = TextUIFactory()
> > +        pb = TTYProgressBar(to_file=stderr, to_messages_file=stdout)
> > +        result = pb.note('t')
> > +        self.assertEqual(None, result)
> > +        self.assertEqual("t\n", stdout.getvalue())
> > +        self.assertEqual("\r                                                                               \r", stderr.getvalue())
> > 
> 
> This looks like a syntax error. Which means the code is not being run.

nah, its your terminal or our diff getting confused.. Thats the literal
output from a progress bar.

> ...
> 
> > +    def __init__(self, url, format):
> > +        self.format = format
> > +        self.bzrdir = BzrDir.open_unsupported(url)
> > +        if self.bzrdir.root_transport.is_readonly():
> > +            raise errors.UpgradeReadonly
> > +        self.transport = self.bzrdir.root_transport
> > +        self.convert()
> 
> I generally feel that having an action occur in a constructor is not
> desirable.

That was already like that, so I'm going to plead 'SEP' here. But I
agree in terms of rules of thumbs.

> Overall, it looks good. I'd like a few things addressed. But it looks
> very close to mergeable.

Cool.

Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060217/16ef9169/attachment.pgp