HPSS status from the NL sprint...

Robert Collins robertc at robertcollins.net
Wed Feb 7 04:39:38 GMT 2007


On Wed, 2007-02-07 at 15:27 +1100, Andrew Bennetts wrote:

> I'd love to get the tests on that branch passing ASAP.  It's a real pain working
> on a branch where >100 tests failing is the norm, and I imagine it's an even
> larger barrier for potential new contributors to that branch.
> 
> I think next time I'd try harder to have a branch that's always closer to a
> state that could be acceptable for merging to bzr.dev.

Its tricky with large work, but yeah - having tests passing is quite
important.

> > As far as I can tell, the following things are all that are needed to
> > get to phase 1:
> >  * RemoteBranch paramterisation. At the moment the
> > branch_implementations tests dont run test for RemoteBranch. This should
> > be quite straight forward to hook up, requiring the same changes done to
> > the repository_implementations logic.
> 
> I already did this a couple of days ago.  Try "bzr selftest
> branch_implementations.*Remote" for example.

Cool - I've been travelling, so ... didn't see it :).

> >  * LockDir reuse. Because the server runs small transactions, but we
> > want to hold locks open longer, we need a way to tell the server that
> > the lock held on e.g. the Branch, is the one we created, and also that
> > it should be able to attach to that lock. This is quite easily done by
> > having lockdir be able to provide a token for the lock, and extending
> > the lock_write methods on Branch and Repository to allow passing in a
> > token. Finally we need the RemoteBranch and RemoteRepository objects to
> > acquire this token and pass it around as needed. I think for efficiency
> > it should look something like [pseudo code]:
> >   branch_token, repository_token = self.client.call('lock this branch')
> >   if self._real_branch is not None:
> >       self._real_branch.lock_write(branch_token=branch_token,
> > repository_token=repository_token)
> > 
> > That is, that the smart server protocol should perform the locking, and
> > no individual lock steps should cross the wire.
> > 
> > We probably need to allow this call to include an already acquired
> > repository token, for when we have two branches pointing at the same
> > repository, and lock both.
> 
> Is this actually needed to make tests pass?  Why wouldn't providing the
> existing lock_read/lock_write methods over the smart protocol be enough? 
> 
> I don't feel like I have a very solid understanding of the problem here.  Is
> there a test case demonstrating the problem?

Consider methods on RemoteRepository which require a write lock.

If the method is implemented by doing a self.client.call(), where the
remote end does 'repository.lock_write(); repository.method();
repository.unlock()' (as many needs_write_lock decorated methods do),
then we can make this fail on the client by locally doing:

repository.lock_write()
repository.method()
repository.unlock()

because, repository.lock_write invokes
self._real_repository.lock_write(), which places the physical repository
into 'lock held' .

then when the remote repository tries to do repository.lock_write(), the
lock will fail : the physical repository is already locked.

Essentially its a conflict between our stateful repository API which
allows 'locking', and the stateless model of the smart server protocol,
which requires all transactions to complete in a single method call.

Having a token that the client can maintain to let the server get back
to speed lets this work.

> >  * RemoteBranch and RemoteRepository lock context caching. Currently
> > there is no caching of data within a transaction for these classes,
> > which there is for real objects. We should wire this up - see
> > LockableFiles for the core of the logic - and then use it judiciously.
> > An example is getting the branch config file - thats a round trip every
> > time branch.get_config() is called, and its not needed. (in this case,
> > its currently also a problem for regular branches, but most things are
> > not:).
> 
> This sounds like a performance issue, not a correctness one.  Is this related to
> getting passing tests, or merely nice to have?

I strongly suspect correctness will be involved; but certainly we dont
want the hpss branch to be a downgrade :).

> >  * Slug through the tests. This is basically ensuring there is either a
> > real_object based implementation, or adding a verb to the smart server
> > to fast-path a specific request. This is very easy now, with the
> > refactoring we did during the sprint.
> 
> I'd in fact be inclined to freeze adding new verbs in favour of focussing on
> getting tests passing.  At the moment I think there's a great risk that
> incidental breakage in adding these new features might not be noticed quickly
> because of the dire state of the test suite.

Scratch thy own itch :). I'd rather have folk working on all parts of
it, than holding off - and as the test suite comes online it will
improve. Now that RemoteBranch is running conformance tests, I'd suggest
we look at saying 'do no harm' as the policy: as long as the failure
number drops before and after a commit; lets go with it.

Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20070207/403fea94/attachment.pgp 


More information about the bazaar mailing list