Bzr's startup time

John Arbash Meinel john at arbash-meinel.com
Mon Jun 26 14:36:13 BST 2006


Matthieu Moy wrote:
> Martin Pool <mbp at canonical.com> writes:
> 
>> It is highly impressive, considering it's Python.  John's cool
>> --profile-imports option (now in bzr.dev) gives us a tool for
>> finding modules and other things that slow down startup, and we're
>> working on eliminating them.  A longer term change is the
>> integrated working directory state that Robert posted about recently. 
> 
> I think the large number of python imports at the beginning of
> builtins.py is also a problem.
> 
> For example, to run "bzr version", I need to load builtins.py which in
> turn will load many things in bzrlib to just issue a trivial message.
> Similarly, some local operations (bzr ignore, bzr add) will need to
> import Branch.
> 
> It would probably be wise to split the command functions in
> builtins.py into a trivial function doing only the argument parsing,
> and calling a function in another file, which in turn would do the
> correct imports and perform the actual job.
> 

(All these numbers were measured on a fairly fast machine, YMMV).

Actually, there are bigger problems than just 'builtins.py'. For example
'ConfigObj' used to take 130+ms to import, because it was importing its
validation routines.
We fixed that, but it still takes 30+ms to import. And we use that to
read our ~/.bazaar/bazaar.conf, which means we have to import it if we
want to support command aliases. (Not that we couldn't use a different
parser, but that is a different discussion).

And then loading cElementTree also takes > 100ms, but only because it
imports 'copy' and elementtree.ElementTree. The latter compiles a
regex, which takes 35+ms, and the former imports inspect, which
imports tokenize, which also compiles a bunch of regexes.
(If you hack around a little bit, cElementTree itself only takes 8ms to
import.)

I've investigated it a lot. I have a branch up (demandload), and I
looked pretty closely at what is costing us time.
For example, importing 'subprocess' actually imports 'pickle' which
costs 10+ms.
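
A rough way to get numbers like these is to time each import directly.
The sketch below is only an illustration, not how --profile-imports is
implemented. It assumes a modern Python, and the module names are just
examples. A module that is already in sys.modules shows up as nearly
free, so each measurement is best taken in a fresh interpreter.

```python
import importlib
import sys
import time


def time_import(name):
    """Return the seconds spent importing `name` (near zero if cached)."""
    start = time.perf_counter()
    importlib.import_module(name)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Example module names only; substitute whatever you suspect is slow.
    for name in ("json", "subprocess", "xml.etree.ElementTree"):
        was_cached = name in sys.modules
        elapsed_ms = time_import(name) * 1000.0
        note = " (already imported)" if was_cached else ""
        print("%-24s %7.2f ms%s" % (name, elapsed_ms, note))
```

Recent Python versions (3.7 and later) can also print a per-module
breakdown with 'python -X importtime'.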

There is a fairly decent module that the hg guys use, called
'demandload'. Basically it creates an object that will replace itself
with your module once you actually use it.
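
To make the idea concrete, here is a minimal sketch of such a
self-replacing placeholder. It is not the Mercurial demandload code;
LazyModule and demand_import are hypothetical names, and the sketch only
handles top-level (undotted) module names.

```python
import importlib
import sys


class LazyModule(object):
    """Placeholder that imports the real module on first attribute access."""

    def __init__(self, scope, name):
        self._scope = scope  # namespace dict the placeholder lives in
        self._name = name    # top-level module name to import later

    def _load(self):
        module = importlib.import_module(self._name)
        # Swap the placeholder for the real module, so later lookups in
        # this namespace no longer go through the proxy at all.
        self._scope[self._name] = module
        return module

    def __getattr__(self, attr):
        # Only called for attributes the placeholder itself lacks,
        # i.e. the first real use of the module.
        return getattr(self._load(), attr)


def demand_import(scope, name):
    """Bind `name` in `scope` to a placeholder instead of importing now."""
    scope[name] = LazyModule(scope, name)


if __name__ == "__main__":
    demand_import(globals(), "colorsys")
    print(type(colorsys).__name__)           # still the placeholder
    colorsys.rgb_to_hsv                      # first use triggers the import
    print(colorsys is sys.modules["colorsys"])
```

The cost of the import is paid at first use instead of at startup, so a
command that never touches the module never pays for it.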

I went through and just started switching everything to use it as much
as possible. At the time '--profile-imports' wasn't as nice as it is
now, so it was a little bit tricky to see what was costing us so much.

After all my tricks, I managed to bring 'bzr --no-plugins root' time
down from 360ms to 220ms.
Not as much of an improvement as I would have liked, considering the effort.

Now consider a different tack... My 'service' plugin launches bzr as a
server process after importing all of bzrlib. When a client (python or
compiled C) connects, it forks a new child, which doesn't have to
import anything new.
That brings the time for 'bzr root' down to 50ms (and 'bzr rocks'
finishes in 10ms).
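
The pre-import-then-fork pattern can be sketched roughly as follows.
This is not the 'service' plugin itself: it is a Unix-only illustration
in which a socket pair stands in for a real client connection, the
imported modules stand in for bzrlib, and handle(), serve_one() and
run_client() are hypothetical names.

```python
import os
import socket

# Pay the import cost once, in the long-lived parent.
# (Stand-ins for "all of bzrlib".)
import json
import subprocess


def handle(command):
    """Hypothetical command handler; a real one would call into bzrlib."""
    return "ok: %s" % command


def serve_one(conn):
    """Fork a child that answers one request on `conn`, then exits."""
    pid = os.fork()
    if pid == 0:
        # Child: everything the parent imported is already in memory.
        status = 0
        try:
            command = conn.recv(1024).decode("utf-8")
            conn.sendall(handle(command).encode("utf-8"))
        except Exception:
            status = 1
        finally:
            conn.close()
            os._exit(status)  # skip the parent's cleanup handlers
    return pid


def run_client(command):
    """Send one command over a socket pair and return the child's reply."""
    client, server = socket.socketpair()
    pid = serve_one(server)
    server.close()  # the child holds its own copy of this end
    client.sendall(command.encode("utf-8"))
    reply = client.recv(1024).decode("utf-8")
    client.close()
    os.waitpid(pid, 0)  # reap the child
    return reply


if __name__ == "__main__":
    print(run_client("rocks"))
```

Each request then costs a fork plus the command itself, rather than a
full interpreter startup and import chain.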

So if you really want blazingly fast runtime, we may need to think about
some real wizardry.

Anyway, I've been looking into it a lot. And I haven't quite figured out
what is the least evil thing to do. For example, copy only uses
'inspect' for a semi-trivial function, and definitely doesn't need
inspect to load tokenize.

Or we can monkey-patch 're.compile()' to return a proxy object that
doesn't compile itself until necessary. We need to think about how we
play with other people wanting to use bzrlib.
I'm tempted to have a few evil hacks in bzrlib that don't install
themselves by default, but the 'bzr' front-end script turns them on.
Then someone just using bzrlib doesn't have any problems, unless they
explicitly turn them on. Though it does mean the bzr core starts
functioning differently from what 3rd party clients would do....

John
=:->

