bzr too slow

Tue Jan 10 23:23:16 GMT 2006

On Tue, 2006-01-10 at 13:08 -0600, John Arbash Meinel wrote:
> Denys Duchier wrote:
> > John Arbash Meinel <john at arbash-meinel.com> writes:
> > 
> > 
> >>Are you actually seeing the file getting written out repeatedly? If so,
> >>I'll +1 a patch.
> > 
> > 
> > I have a patch available as revision 1516 in the branch at:
> > 
> >            http://delta.univ-orleans.fr/~duchier/bzr/bzr.call_at_end
> > 
> > 
> > I can confirm that this was _the_ major bottleneck for "bzr status".
> > Previously, when running "bzr status" on my linux tree after I had touched all
> > the files, I had to interrupt the process after 84mn.  With this patch, it
> > completes in 4mn 51s.
> > 
> > The patch passes the test suite.
> > 
> > The idea of the patch is to extend transactions to allow registering callbacks
> > to be executed at the end of the transaction.

This might be appropriate for other things, but I'm not sure it's the
right thing here. The core problem is that the branch and working tree
locks are not yet separated.

I would do this instead:
=== modified file 'bzrlib/workingtree.py'

--- bzrlib/workingtree.py
+++ bzrlib/workingtree.py
@@ -914,9 +914,11 @@
         between multiple working trees, i.e. via shared storage, then we
         would probably want to lock both the local tree, and the branch.
         """
-        if self._hashcache.needs_write:
-            self._hashcache.write()
-        return self.branch.unlock()
+        result = self.branch.unlock()
+        if self.branch._lock is None:
+            if self._hashcache.needs_write:
+                self._hashcache.write()
+        return result


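This is completely trivial and does not depend on the transaction layer
at all: the hashcache is written only once the branch lock has actually
been released, not on every nested unlock. I think transactions are a
higher-level consideration than this.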
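
A minimal sketch of the effect of that change, using stand-in classes.
FakeBranch, FakeHashCache and FakeWorkingTree are hypothetical and are
not bzrlib code; the sketch assumes the branch lock is re-entrant and
that branch._lock becomes None only when the outermost lock is released.

```python
class FakeBranch:
    """Hypothetical stand-in for a branch with a re-entrant lock."""

    def __init__(self):
        self._lock = None
        self._lock_count = 0

    def lock_write(self):
        # Only the outermost lock_write() takes the real lock.
        if self._lock_count == 0:
            self._lock = object()
        self._lock_count += 1

    def unlock(self):
        if self._lock_count == 0:
            raise RuntimeError("not locked")
        self._lock_count -= 1
        # Only the outermost unlock() drops the real lock.
        if self._lock_count == 0:
            self._lock = None


class FakeHashCache:
    """Hypothetical stand-in that counts how often it is written."""

    def __init__(self):
        self.needs_write = False
        self.writes = 0

    def write(self):
        self.writes += 1
        self.needs_write = False


class FakeWorkingTree:
    def __init__(self):
        self.branch = FakeBranch()
        self._hashcache = FakeHashCache()

    def lock_write(self):
        self.branch.lock_write()

    def unlock(self):
        # Same shape as the proposed unlock(): release first, then
        # flush the cache only if the lock is really gone.
        result = self.branch.unlock()
        if self.branch._lock is None:
            if self._hashcache.needs_write:
                self._hashcache.write()
        return result


def demo():
    tree = FakeWorkingTree()
    tree.lock_write()                 # outermost lock
    for _ in range(3):                # nested lock/unlock pairs
        tree.lock_write()
        tree._hashcache.needs_write = True
        tree.unlock()
    writes_while_locked = tree._hashcache.writes
    tree.unlock()                     # outermost unlock flushes once
    return writes_while_locked, tree._hashcache.writes
```

With the original ordering, each nested unlock() in the loop would have
written the cache; here the write is deferred to the final unlock().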

> Normally call_at_end functions are created in a LIFO queue, so that when
> you do some sort of setup, it cleans itself up at the end.
> I see you already have a check that the function isn't added more than once.


> But I like the idea of having 'cleanup' functions registered with the
> transaction, which can happen when the transaction is finished.

It's got pluses and minuses for me. It adds a level of unpredictability
to the transaction concept - I can easily imagine it being abused. And
at the moment we know when the transaction finishes: in unlock. The
issue as I see it is not having a real lock on the working tree,
because the branch unlock is completely unrelated to the working tree
unlock, as is the branch's transactional success/failure.

> I'm also concerned about the semantics of finish() versus commit(). Do
> we always call Transaction.finish() whether we commit() or abort()?
> Do we need a separate callback queue for things to run if the
> transaction succeeds versus what happens if it is canceled?

If we do completion events, then I would be very much *against* success
vs failure cleanups - they give too much of a 'use this to do important
things' feeling. Maybe just a silly concern but ...
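
For illustration only, a single completion queue of the kind discussed
above might look like the following. The names Transaction,
register_finish_callback and finish are assumptions made for this
sketch; they are not the API of the patch or of bzrlib. Callbacks run
once, in LIFO order, whenever finish() is called, and there is
deliberately no separate success or failure queue.

```python
class Transaction:
    """Hypothetical transaction with one completion-callback queue."""

    def __init__(self):
        self._callbacks = []

    def register_finish_callback(self, func):
        # Ignore a callback that is already registered.
        if func not in self._callbacks:
            self._callbacks.append(func)

    def finish(self):
        # Run callbacks last-registered-first, then forget them.
        while self._callbacks:
            callback = self._callbacks.pop()
            callback()


def demo():
    log = []

    def first():
        log.append("first")

    def second():
        log.append("second")

    txn = Transaction()
    txn.register_finish_callback(first)
    txn.register_finish_callback(second)
    txn.register_finish_callback(first)   # duplicate, ignored
    txn.finish()
    txn.finish()                          # queue is empty, nothing runs
    return log
```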

Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.