[MERGE] Makeing WorkingTree nicer to derive from.

Sun Jul 16 01:41:52 BST 2006

On Sat, 2006-07-15 at 10:08 -0500, John Arbash Meinel wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Robert Collins wrote:
> 
> ...
> 
> > 
> >>>  class WorkingTree3(WorkingTree):
> >>>      """This is the Format 3 working tree.
> >>>  
> >>> @@ -1542,6 +1541,15 @@
> >>>              raise ConflictFormatError()
> >>>          return ConflictList.from_stanzas(RioReader(confile))
> >>>  
> >>> +    def unlock(self):
> >>> +        if self._hashcache.needs_write and self._control_files._lock_count==1:
> >>> +            self._hashcache.write()
> >>> +        # reverse order of locking.
> >>> +        try:
> >>> +            return self._control_files.unlock()
> >>> +        finally:
> >>> +            self.branch.unlock()
> >>> +
> >> I like this refactoring, as it does make things cleaner. The only part
> >> here, is that 'self._hashcache.write()' needs a try/except around it, in
> >> the case that we have a readonly working tree.
> > 
> > I'm happy to add a TODO to that, but its certainly not in the pipeline
> > for what I'm working on - which is dirstate. dirstate will remove the
> > separate hashcache anyway, so there is no motivation to fix that as part
> > of this branch.
> 
> Sure. I'm wondering if we want to completely merge the hashcache into
> the dirstate. It seems like we will want to avoid rewriting it as often
> as we might update the hashcache.
> 
> Although... We probably could make the stat field fixed size. And the
> sha field already is fixed. Since the revision parents don't change
> (unless you do a merge), we probably could actually replace things
> inline, rather that rewriting the whole file....

I thought the plan was to compare the hash cache data as binary blobs,
so they will be fixed size after either uuencoding or hexlifying, and
thus updatable in place.

> > ..
> >> Rather than putting a try/except around self._hashcache.write() twice,
> >> it might be better to have a helper function in the base working tree.
> > 
> > or perhaps write() should fail softly.
> 
> I wasn't sure where the logic that a failure is okay should be. I kind
> of prefer having WorkingTree know about the failure, but just ignore it.
> 
> I suppose we could have hashcache.write() return success/failure.

That would work for me.

> You didn't really give an explanation why you changed lock.py. Is it
> just so that you can prevent multiple locks of the same object? We've
> allowed that for quite some time.
> And there were a few tests that depended on it. (Things that would do
> 'bzr merge -r revid:foo .')
> I would kind of like to support it. Because it can be handy when 'foo'
> is not in your ancestry. (I've used it after uncommit/bundles/etc)

There's nothing in the test suite that is broken by my change to
lock.py. The change to lock.py makes it do the following:
 * shared readlocks of foo.
 * writelocks of foo are exclusive.

Which is the same as we get with the lockdir formats. 'bzr merge -r
revid:foo .' still works just fine.

Supporting multiple writelocks of the same path with different python
objects is incredibly dangerous - lock instance 1, lock instance 2, do a
pull in 1, do a commit in 2 and watch the repository get corrupted.

> ...
> 
> I also would not like to lose testing format 2 and 3 explicit locking
> sequence. I do think it is important to lock in the correct way. I agree
> that we don't need every implementation to lock in this exact way. Your
> tests are at a level that they can't detect locking order. Which means
> one implementation might change order, and cause deadlocks with
> something else.

I've thought about this, and your tests dont prevent deadlocks, and mine
dont increase the risk about deadlocks. [dont forget, our new locks have
timeouts too].

Firstly, both our tests only test one code base. They dont test code
base A locking a tree and branch, and code base B also locking a tree
and branch. 

So ALL the code our tests encounter locks in the same order - TB or BT,
for any given tree format and lock mode. We can have different orders if
one instance is locking for read and one for write though.

Now, what deadlock conditions do we have? We CANNOT have two lockers
waiting on each others locks of the same mode. [see above]. Can we have
them deadlocking on a readlock vs write lock ? Yes - if the branch is
locked first in one, and the tree in the other mode: (imagine this is
the underlying call sequence-
wt_a.branch.lock_write()
wt_b.lock_read()
wt_b.branch.lock_read()
wt_a.lock_write()
This will deadlock as the last two calls will not return, or will
timeout on timing-out-locks.

I note that your tests implicitly covered this because they tested each
path seperately to the same standard. But its the only deadlock I can
conceive of, and its possible because there is no test for it
specifically, to reintroduce it even with your tests - by changing one
test but not the other.

But all of that considered, I'd much rather make our os locks timeout
rather than stopping cold, and if we do that deadlocking stops being an
issue because the thread will return a failure rather than deadlocking.

Is it possible to have two processes retrying forever with timeouts?
yes, but I think that is an issue for the caller - if you get a
LockError, and want to retry, its up to you to randomise a little.

> I'm fine moving it somewhere else. I do think the test has value on its
> own, though.

At this point, I dont - I think its testing something that can change
without harming anything in the program, and as such theres no need to
observe it with a test.

> I think your tests are a good addition. And probably better as a generic
> interface test. But they are testing something different than what I was
> testing.

Thats by design, see above ;).

> We don't have a good way with our adapters to say 'test *these*
> implementations, but not *these* implementations'. And I realize with an
> interface test, you really aren't supposed to do that.
> But we do have a cause for testing certain implementations, and DRY
> would indicate we shouldn't copy the tests. And you don't seem to like
> mixins...
> Any suggestions?

The classic way to deal with this is to have multiple defined
interfaces, which can extend each other.
'the working tree interface'
'the bzr native tree interface' extends 'the working tree interface'.

And then you have tests for 'the bzr native tree interface'.

Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 191 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060716/180b69a4/attachment.pgp