RFC: startup time - again

Robert Collins robertc at robertcollins.net
Tue Sep 9 11:25:08 BST 2008


On Tue, 2008-09-09 at 10:50 +0100, Russel Winder wrote:
> 
> On Tue, 2008-09-09 at 19:22 +1000, Robert Collins wrote:
> > So, Martin and I just had an interesting chat. We were talking about
> > startup time and lazy_import etc.
> 
> I don't have anything concrete to contribute to the core of this issue,
> I would just say that for the average user a startup time < 0.5s is
> probably in the noise: if the startup time was 2s then there would be an
> issue but < 0.5s is likely no problem.  So whilst getting startup time
> down is good, I would suggest that making overall elapse time of the
> common commands for common size branches as short as possible should be
> the highest priority.

So command performance matters in many ways. For instance, 'bzr st' on
an empty branch is 16 seconds for me - with cold cache. Much of that 16
seconds is loading code we don't use -> shrinking the amount of code we
load will improve performance in that case (quite dramatically I would
expect).
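
As a rough sketch of what deferring a load can look like: this uses the
standard library's importlib.util.LazyLoader from current Python, not
bzrlib's lazy_import, and the helper name lazy_import_module is made up
for illustration. The module body only runs on first attribute access.

```python
import importlib.util
import sys


def lazy_import_module(name):
    """Return a module object whose code is not executed until first use.

    Hypothetical helper for illustration; follows the LazyLoader recipe
    from the importlib documentation.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError("no module named %r" % name)
    # Wrap the real loader so that exec_module() is postponed.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader this only arms the deferred load; nothing runs yet.
    loader.exec_module(module)
    return module


# Cheap at this point: the module's source has not been executed.
colorsys = lazy_import_module("colorsys")
# The first attribute access triggers the real import.
hsv = colorsys.rgb_to_hsv(1, 1, 1)
```

A command that never touches the module never pays for loading it, which
is the effect I am after.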

Secondly, there is a lower threshold above which a command feels slow.
0.3 seconds feels slow; 0.1 seconds doesn't. If we sum the times:
  0.04 seconds to do the status
  0.02 seconds to load python
  <something> to load the code actually used
we have a pretty good shot at getting down to something that feels
dramatically faster.
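
To make the budget explicit, here is the same sum as a small sketch. The
0.04 and 0.02 figures and the 0.1 second threshold are the ones above;
the variable names are only illustrative, and the time to load the used
code is the unknown being solved for.

```python
# All times in milliseconds to keep the arithmetic exact.
FEELS_FAST_MS = 100      # 0.1 seconds does not feel slow
STATUS_WORK_MS = 40      # 0.04 seconds to do the status
PYTHON_STARTUP_MS = 20   # 0.02 seconds to load python

fixed_cost_ms = STATUS_WORK_MS + PYTHON_STARTUP_MS
# What is left for loading the code the command actually uses.
code_loading_budget_ms = FEELS_FAST_MS - fixed_cost_ms

print("fixed cost: %d ms" % fixed_cost_ms)
print("budget for loading code: %d ms" % code_loading_budget_ms)
```

So on these numbers the code we actually load has to come in at around
0.04 seconds for the whole command to stay at 0.1 seconds.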

And, as project sizes scale up, yes, startup time becomes more 'noise'.
But you still get folk 'testing' how long subsecond commands take.

Of course startup time doesn't matter when you consider minutes-long
operations like pull, which are being worked on.

One limiting factor on speed, though, is python. Each function we call on
every path in a 100K-path tree adds about 60ms to the total time of the
operation, on my core 2 duo laptop. That's about 15 function calls *total*
to process every path in the tree, or the operation is > 1 second.

stat is one call
stat_fingerprint is another
list.append(path) is another
lookup of the old fingerprint is another
etc - we hit 15 calls extremely quickly.
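
The arithmetic behind that limit, as a sketch. The 60ms and 100K figures
are the ones quoted above; the function names are made up for
illustration, and measure_one_pass is only a crude way to check the
per-call cost on your own machine (it includes loop overhead, so treat
the result as an upper bound on the call cost alone).

```python
import timeit

PATHS = 100000          # paths in the tree
COST_PER_CALL_MS = 60   # one extra call per path costs about 60ms in total


def total_ms(calls_per_path):
    """Total time spent purely on call overhead, in milliseconds."""
    return calls_per_path * COST_PER_CALL_MS


def max_calls_within(budget_ms):
    """How many calls per path fit inside the budget."""
    return budget_ms // COST_PER_CALL_MS


def measure_one_pass(paths=PATHS):
    """Time one trivial Python function call per path, in seconds."""
    def noop(path):
        return path

    items = [str(i) for i in range(paths)]

    def run():
        for item in items:
            noop(item)

    # Best of three single runs, to reduce noise.
    return min(timeit.repeat(run, number=1, repeat=3))


print("15 calls per path: %d ms" % total_ms(15))
print("calls that fit in 1 second: %d" % max_calls_within(1000))
print("measured here: %.1f ms per pass" % (measure_one_pass() * 1000))
```

On the quoted figures 15 calls per path is already 900ms, and the 17th
call takes the operation past one second.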

Moving code to C can help with this by providing a substantially lower
cost-to-call, but it's got limitations on testing and profiling the way
we do it at the moment, and code maintenance and integration are
important, particularly if the C code base is going to increase.

Manipulations to the way bzr starts up don't have much impact on
performance outside startup; but things that reduce the code loaded
could well provide a solid basis for getting more robust C integration -
which could actually lead to more maintainable and cleaner code, because
we wouldn't be fighting against python performance to write a fast
program.

-Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.