bzr too slow

John Arbash Meinel john at arbash-meinel.com
Wed Jan 11 14:13:23 GMT 2006


Denys Duchier wrote:
> John Arbash Meinel <john at arbash-meinel.com> writes:
> 
> 
>>Denys Duchier wrote:
>>
>>
>>>The case of multiple concurrent read transactions also has problems: each
>>>transaction may update different parts of the hashcache.  Ideally when writing
>>>back the hashcache, we should only merge into the on-disk hashcache the entries
>>>that we have actually updated during the transaction.  In this manner,
>>>"up-to-date-ness" of the on-disk hashcache would be monotonic.
>>>[...]
>>
>>The hash-cache is only concerned with the state of the working
>>directory, which is not directly linked to the state of the branch.
>>You write the hash-cache when you have a read lock. Which means that two
>>bzr instances can grab a read lock, and they both will try to update the
>>hash-cache at the same time. And that would happen whether you use your
>>'run at cleanup' code, or if we use 'do we have the lock' code, or if we
>>use 'write when the lock goes away'. None of them fix the problem that
>>you think you are fixing.
> 
> 
> If you look at the quoted paragraph again you'll see that I understand the
> problem.  A "cleanup action" is part of my solution.  What such an action should
> do is not to blindly overwrite the on-disk hashcache, but instead update it with
> the entries that we have actually updated during our transaction.  Of course, we
> need mutual exclusion at this point (another lock for updating cached data).
> 

Well, you would have to read the file, and figure out which entries need
to be modified. And since the fields aren't fixed width, you can't
really replace inline. Even if you could, if you are interrupted while
writing, you end up with a corrupted hash-cache.

I suppose you could just append to the file, and define that the last
entry for a given path is the correct one. But you again have the
problem that now the file is always growing. And it really needs to be
squished periodically.

> 
>>I know I brought up the idea of code which gets run if commit() is
>>successful, and code that always gets run. Robert has some valid concern
>>over that sort of code. Primarily that Transaction isn't really where
>>that should be happening.
> 
> 
> So far, I think it is; because a transaction is a unit of coherence.
> 
> Cheers,
> 
> --Denys

But the working tree can be changing underneath you anyway. Someone
might be running patch in one window, and running a limited commit in
another window.
Or having two windows with 'bzr status' running. Pehaps running 'bzr
status foo' and 'bzr status baz'.
All of these cause race conditions with the hash-cache.

I like the idea of Transaction having 'run this at end'. But Robert has
a strong concern that people will start using it for important things
that should happen, and it isn't designed for that.
Really, a Transaction right now is just a working cache. I don't know if
it is going to become a real transaction at some time in the future.
(Such that commit() or rollback() is feasible).

I'm also not 100% satisfied with running a lot of operations in
unlock(), since unlock is generally run as:

lock()
try:
  do stuff
finally:
  unlock()

Which means that unlock() should never raise, because it would mask the
real exception.
What we are doing right now is a hack because we know unlock() is always
called. And we don't have a good place to put deferred actions.
And hashcache is hidden inside WorkingTree, so the upper layer that
knows when it is done with WorkingTree, doesn't know that there is a
hash-cache which might need to be flushed to disk.

And we don't trust __del__ methods, which is where I would put this sort
of code in C++.

Having Transaction with a queue of things to do at cleanup isn't
terrible. But Robert is the one who knows what the plan is for
Transaction. (He wrote it in the first place.) So he seems like a good
source for final decisions on how it should change.

John
=:->
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 256 bytes
Desc: OpenPGP digital signature
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060111/9eb01209/attachment.pgp 


More information about the bazaar mailing list