HPSS: Client-side objects

Andrew Bennetts andrew at canonical.com
Tue Apr 17 10:00:51 BST 2007


Robert Collins wrote:
> On Tue, 2007-04-17 at 13:08 +1000, Andrew Bennetts wrote:
> > This subset of the full hpss diff adds RemoteBranch, RemoteRepository,
> > RemoteBzrDir, and some tests for those, and some other small changes.
> 
> Perhaps 'remote' should be a package with 
> remote/bzrdir.py
> remote/branch.py
> remote/repository.py

Perhaps.  I'm comfortable with leaving it as is for now.

> +# TODO: At some point, handle upgrades by just passing the
> +# whole request across to run on the server.
> 
> Perhaps this should be a bug?

Done.  https://bugs.launchpad.net/bzr/+bug/107173

> +class RemoteBzrDir(BzrDir):
> +    """Control directory on a remote server, accessed by HPSS."""
> 
> perhaps
>     """Control directory on a remote server accessed over bzr:."""

Yeah, "HPSS" is rather opaque.  However, your suggestion looks like a typo (even
though it isn't), and is slightly inaccurate.  I've changed it to this:

    """Control directory on a remote server, accessed via bzr:// or similar."""

> +        # XXX: We should go into find_format, but not allow it to find
> +        # RemoteBzrDirFormat and make sure it finds the real underlying
> format.
> +        self._real_bzrdir = None
> 
> I dont get this XXX - could you explain it a little more, or file a bug
> for it ?

Neither do I.  gannotate blames me, but I have no idea what I had in mind.

So I've simply deleted this XXX.  If the alleged problem is ever rediscovered,
then the rediscoverer can write a proper bug report for it :)

> +        #self._real_bzrdir._format.probe_transport(transport)
> 
> This comment can be removed
> 
> +        response = self._client.call('probe_dont_use', path)
> +        if response == ('no',):
> +            raise errors.NotBranchError(path=transport.base)
> 
> The probe_dont_use verb name should be changed. It seems like its a
> useful API to keep. Perhaps 'BzrDir.open' is the right name.

Done.  (And I've renamed and moved the relevant server request handler
appropriately.)

> +        if response == ('no',):
> +            raise errors.NotBranchError(path=transport.base)
> 
> This is clearly faulty in the case of e.g. 'barfoo' coming back as the
> response.

Added this check:

        if response not in [('yes',), ('no',)]:
            raise errors.UnexpectedSmartServerResponse(response)

> +    def _ensure_real(self):
> +        """Ensure that there is a _real_bzrdir set.
> +
> +        used before calls to self._real_bzrdir.
> +        """
> 
> Perhaps the second line could start with 'U' :) (And elsewhere in
> remote.py)

Fixed.

> +        if not self._real_bzrdir:
> +            default_format = BzrDirFormat.get_default_format()
> +            self._real_bzrdir =
> default_format.open(self.root_transport,
> +                _found=True)
> 
> Why does this get a format object rather than calling
> 'BzrDir.open_on_transport' ?

Because calling BzrDir.open_from_transport here causes infinite recursion.

I've added this XXX to explain it:

            # XXX: We can't use BzrDir.open_from_transport here because it
            # causes infinite recursion, so just try opening the bzrdir with the
            # default format.

> +    def create_repository(self, shared=False):
[...]
> 
> These should have an '_ensure_real' guard. Perhaps _ensure_real() should
> return the thing, then you could do:
> return self._ensure_real().get_workingtree_transport(workingtree_format)

That would be cute.  Perhaps too cute :)

Anyway, I've fixed the missing calls to _ensure_real.

> +    For the moment everything is delegated to IO-like operations over
> +    the transport.
> +    """
> 
> 'perhaps most things' rather than 'everything'. Or better 'For the momst
> most operations are performed using local transport-backed Repository
> objects.'

Done.

> +    def get_revision_graph(self, revision_id=None):
> The body of this method strongly reminds me of the protocol cleanup we
> need. I think we should probably do that asap before merging the hpss
> client side objects - it will make the cleanup I would be asking for
> easier. (Which is - dont use assertions, catch errors correctly,
> re-raise appropriate errors locally).

I fixed one assert to raise UnexpectedSmartServerResponse instead.

I'm not sure what you mean by "catch errors correctly, re-raise appropriate
errors locally" — I guess you mean for there to be common error handling
infrastructure on client side?

> +            # FIXME
> fix WHAT?

I have no idea.  gannotate blames Wouter and you :)

I'll just delete that FIXME.  The code looks ok to me (it'll give a rather
unhelpful exception if the data from the server is in the wrong format, but in
general invalid data can always cause confusing problems).

> +    def lock_read(self):
> +        # wrong eventually - want a local lock cache context
> 
> ^ This should be clarified. Specifically it should either create its own
> cache context as Repository does, or perhaps reuse the one of its
> _ensure_real'd instance.

I don't know what a "cache context" is.  Something to do with
VersionedFile.clear_cache?  grep isn't really helping me much here.

> +            if mode != 'w':
> +                return
> This could use a comment - that only write lock remote repositories need
> a method call to perform the unlock.

Done.

> +        # XXX: This assumes that the branch control directory
> is .bzr/branch,
> +        # which isn't necessarily true.
> +        LockableFiles.__init__(
> +            self, bzrdir.root_transport.clone('.bzr/branch'),
> +            'lock', lockdir.LockDir)
> 
> This should be fixed. Its easy: use the get_branch_transport method.

Done.  (I assume passing None to get_branch_transport is ok?)

> +        #branch.Branch.__init__(self)
> 
> Delete it or explain it :)

Replaced with:

        # We intentionally don't call the parent class's __init__, because it
        # will try to assign to self.tags, which is a property in this subclass.
        # And the parent's __init__ doesn't do much anyway.

> the unlock() method on both RemoteBranch and RemoteRepository should
> probably discard the _real object when the final unlock occurs; perhaps
> this is premature though. Just a randomish thought.

Hmm, perhaps.  Something to think about later.

> +    def test_open_remote_branch(self):
> +        # open a standalone branch in the working directory
> +        b = remote.RemoteBzrDir(self.transport)
> +        branch = b.open_branch()
> 
> Shouldn't branch be checked for something? e.g. assertIsInstance?

Done.

> +        # XXX: This seems more like a UI function than something that
> really
> +        # belongs in this class.
> 
> This should be a bug, not a XXX.

Done.

> +# We can't use register_control_format because it adds it at a lower
> priority
> +# than the existing branches, whereas this should take priority.
> +BzrDirFormat._control_formats.insert(0, RemoteBzrDirFormat)
> 
> Lets extend register_control_format.

As discussed on IRC, made a "register_control_protocol" method and used that
instead:

    @classmethod
    def register_control_server_format(klass, format):
        """Register a control format for client-server environments.

        These formats will be tried before ones registered with
        register_control_format.  This gives implementations that decide to the
        chance to grab it before anything looks at the contents of the format
        file.
        """
        klass._control_formats.insert(0, format)

> +        if isinstance(to_repository, RemoteRepository):
> +            to_repository._ensure_real()
> +            to_repository = to_repository._real_repository
> +        if isinstance(from_repository, RemoteRepository):
> +            from_repository._ensure_real()
> +            from_repository = from_repository._real_repository
> 
> ^ this is bogus I think. Really it should be done by an
> InterRemoteRepository which chooses to delegate to the real
> repositories.

Done.

> With everything I mention addressed, I'm +1 on this - but all the remote
> methods need their error handling cleaned up - which is why I think that
> should be done first.

Can you be more specific about this?  Do you just want common logic for
translating error responses from the server?

-Andrew.




More information about the bazaar mailing list