[BUG] bzr locks itself when pushing from branch to a checkout in the same shared repo

Martin Pool mbp at canonical.com
Tue Feb 6 00:21:29 GMT 2007


On  5 Feb 2007, John Arbash Meinel <john at arbash-meinel.com> wrote:

> The other problem (IMO) is that a_branch.bzrdir.open_branch() does not
> return a_branch, it returns a new branch pointing to the same location.
> We actually make use of that in some of the test suite, as an "easy" way
> to make sure that if someone else has locked a branch that we fail if we
> try to lock it.

I think it is pretty useful to be able to wrestle with yourself in the
test suite, even if it's not the default behaviour.

> In my cvsps-import plugin, I use a repository shared across branches,
> and I wanted to maintain a repository lock so that things stay cached in
> memory (rather than having to re-read the index files all the time). To
> do it, I hacked in and did:
> 
>  a_branch = Branch.open(path)
>  assert a_branch.repository.bzrdir.transport.base == \
>          repo.bzrdir.transport.base
>  a_branch.repository = repo
> 
> It is a bit of an ugly hack, but it works, and is currently the only way
> to get 2 branches to use the same repository object.

Another thing worth mentioning is that reusing an object will probably
be faster than opening it again, discovering the format, etc.

> At one point this wasn't a huge problem on Linux because we used OS
> level locks, and linux lets the same process lock a file 2 times.
> Windows does not, which manifested itself in "bzr merge ." which was
> part of our test suite at the time. (I think we now have an explict
> check if other.base == this.base to handle something like this).

Of course using or simulating those locks still has the problem that you
can have two objects representing the same file: you just don't know it.

> Another problem, though, is that because of symlinks, normalization,
> alternate transports, etc it is possible to have the same repository
> open over different connections such that the path does not look the
> same. Do we care if someone does:
> 
> cd /path/repo
> bzr checkout sftp://localhost/path/repo/bar foo
> cd foo
> bzr commit -m "foo"
> 
> It isn't as obviously a problem as when they are both local paths.

Well it's not so bad, but clearly if we can avoid the problem then we
should.

> One possibility would be that if we see that the target repository is
> locked, we look at the other repository lock that we hold, and see if
> the unique id is the same.
> 
> But this would require a lot of communication between parts of the code
> base. Like is_locked would have to actually raise the exception, and
> have 'checkout' catch it, and compare with other information and try
> again, etc. (A fairly large amount of API violations/peaking would need
> to happen).

You might be able to do it reasonably if the lock was acquired at open()
time:

 * if not locked, take the lock
 * if locked and the holder is in my process, look up the id in a table 
   and return the existing object refering to it
 * otherwise, block or fail

Obviously that would be a bit of a change from our current
arrangement of taking locks only when they are needed, but perhaps
simpler (or perhaps not).  This would only affect write locks.

> So the simplest thing to do is to just special case the commit code such
> that if a branch has a master branch it checks if the repositories are
> at the same path, and if so, doesn't try to fetch. (Either it could
> overwrite the target's branch.repository member, or do some other trickery).

I'd rather not have that inline in commit.  How about something like
this:

Add a method which takes a repository and url.  Normally it opens the
repository at the url, unless it's the same as the already-open
repository in which case it returns it.  This can be done by comparing
the urls, and if that fails by looking at the lock id in the existing
instance and seeing if it's the same as the one already there.  (So this
method does better when called with the repository already locked.)

Rather than a new method this could be just a new parameter.

Maybe that would be a bit complex to integrate with code that searches
for repositories?

-- 
Martin



More information about the bazaar mailing list