[MERGE] Add an optional 'token' keyword argument to LockableFiles.lock_write

Fri Apr 13 01:51:41 BST 2007

Hi John,

I've finally got around to replying to your comments on the original bundle
(which are still relevant to the newer one).

John Arbash Meinel wrote:
> Andrew Bennetts wrote:
> > John Arbash Meinel wrote:
> > [...]
> >>> I've added this paragraph to the docstring:
> >>>
> >>>         A token should be passed in if you know that you have locked the object
> >>>         some other way, and need to synchronise this object's state with that
> >>>         fact.  For instance, this could happen when accessing the same branch
> >>>         over bzr+ssh:// and then falling back to do some operations on the same
> >>>         branch via sftp://.
> >>>          
> >>
> >> hmm.. falling back to sftp:// after bzr+ssh:// seems a bit weird. (As in
> >> you aren't ever going to do that). But I could be wrong.
> > 
> > That's true.  But the smart server branch does something very similiar; it falls
> > back to using a branch object backed on to a vfs transport (i.e. the 0.11
> > "smart" protocol) if the RemoteBranch doesn't have a smart protocol
> > implementation for that operation.  Ditto RemoteRepositories.
> 
> I certainly understand :). But I'm just pointing out that it is trying
> to use a fairly contrived example. It may be the easiest way to explain
> it...

I've cut it back to this:

        A token should be passed in if you know that you have locked the object
        some other way, and need to synchronise this object's state with that
        fact.

i.e. I've removed the contrived example, but left the rest of the explanation.
It's a bit abstract, but most callers don't need to worry about it so it's
probably ok.  If you don't understand a feature, you probably didn't want to use
it ;)

What do you think?

> >> One other use-case, though. Is that you might have a repository open,
> >> and you are opening a new one. And you don't *know* whether it is the
> >> same repository or not. (You have a heavy checkout and are opening the
> >> master branch, which may or may not be on the same filesystem, or might
> >> be a loopback over sftp/bzr+ssh onto the same filesystem).
> >>
> >> So you *might* be passing a token as "I might already have this open,
> >> but I might not". Is there a reasonable way to do that?
[...]
> 
> So does Branch.lock_write() return a potential token for the underlying
> Repository.lock_write() call? From what I could see of the code, I would
> need to do:

No, but you can do:

    branch_token = branch.lock_write()
    repo_token = branch.repository.lock_write()  # fast, the object is already locked.
    branch.repository.unlock()  # because we didn't want to increment the lock
                                # count, just find out the repo_token.

There are a few hpss tests that need to do this.  We could add a new method to
Branch ("lock_write_tokens"?) if this occurs much in actual code.

> def commit(self, a_branch):
> 
[...snip ugly code]
> 
> 
> Which means that your proposed api *can* be used to support heavyweight
> checkouts which are sharing a repository. But it is kind of .... ugly?
> 
> Anyway, you aren't focused on my use case, you care more about HPSS. I
> just wanted to mention a use case for lock tokens, so we can think about
> them specifically.

Right.

I think for now it's good to know it's possible (which shows that the
underlying concept is sound and flexible enough), and as we get use cases for
more than HPSS we can look at adding more APIs to make these other use cases
easier.  So I'm happy with what we have at the moment.

> > The current API is "arbitrary string that uniquely identifies the lock".  The
> > future API would be a "string that encodes the lock info (e.g. the raw bytes
> > of the lock/info file) that uniquely identifies the lock."  So code written
> > with today's token API will still work just fine with the new one, so long as
> > the new one preserves the properties that 
> >    1) "type(token) is str" is always true (or "token is None" for objects that
> >       don't support lock tokens), and
> >    2) "token_a == token_b" iff they are referring to the same lock.
> > 
> > This is why I say this improvement doesn't need to happen before this can land.
> > 
> 
> Well, it seems like it might be nice to have it be something other than
> "arbitrary string". Which is why it should be discussed. (I guess you
> can argue that it has to be an arbitrary string so it can be passed to
> the smart server?)

To answer your parenthesised question: Well, it has to be serialisable.  And if
the API is initially just "it is an arbitrary string" then that means we don't
need to change the wire protocol when we decide to encode more information in
that string (although a small amount of care will need to be taken to make sure
that clients/servers that can use richer tokens still cope with these nonce
tokens).

There's likely to be a watershed change to the protocol coming up fairly soon
too, so we could also use that opportunity to start requiring a richer token.

I'm completely for more useful tokens, and I think it should be a high priority.
I just want to defer it until the rest of the hpss stuff lands.  hpss with nonce
tokens is much more useful than no hpss.

> > That said, I would definitely like to do this further work in time for 0.16,
> > which would mean no released bzr would have just arbitrary lock tokens w/o debug
> > info, but at the moment I want to focus on getting the smart server landed
> > before working on "nice to have"s like this.
> > 
> > -Andrew.
> 
> Oh, and the other big api compatibility issue is if you have a smart
> server which is returning nonce lock tokens, and a local client that is
> expecting debug-info lock tokens. But as long as the client is only
> passing them back to the server, I guess it doesn't matter.
> 
> (It would specifically matter for the case of switching from bzr+ssh://
> to sftp://, but as that isn't likely to ever happen...)

Right.  I'm not particularly worried by this.

-Andrew.