RFC: startup time - again
David Cournapeau
david at ar.media.kyoto-u.ac.jp
Tue Sep 9 13:25:18 BST 2008
Robert Collins wrote:
> So, Martin and I just had an interesting chat. We were talking about
> startup time and lazy_import etc.
>
> startup time, such as 'bzr rocks' isn't a very useful metric for 'how
> slow is bzr to get going'. Much more useful is running a command with
> little work, and the same command with lots of work.
>
>
I don't understand why 'bzr rocks' is not a useful metric: on my
workstation, 'bzr rocks' takes 0.3 s to do nothing. This seems to be the
core issue for short-running commands: what 'bzr rocks' loads is a
subset of what 'normal' bzr commands need, right? If 'bzr rocks' took,
say, 0.05 s instead (on my workstation, a do-nothing python script
takes 0.02 s), all the other commands would also be faster, no?
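To make the comparison reproducible, here is a rough sketch of how one could measure it. The helper name time_command is mine, not something from bzr, and it takes the best of a few runs to reduce noise from a cold disk cache. It only times 'bzr rocks' if a bzr executable is found on the PATH.

```python
import shutil
import subprocess
import sys
import time


def time_command(argv, repeats=5):
    """Return the best wall-clock time (seconds) over `repeats` runs of argv.

    time_command is a hypothetical helper written for this illustration.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard output so terminal I/O does not skew the measurement.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Baseline: an interpreter that starts up and does nothing.
    baseline = time_command([sys.executable, "-c", "pass"])
    print("do-nothing python: %.3f s" % baseline)

    # Only measured when bzr is installed; 'rocks' does no real work.
    if shutil.which("bzr"):
        total = time_command(["bzr", "rocks"])
        print("bzr rocks:         %.3f s" % total)
        print("overhead:          %.3f s" % (total - baseline))
```

The difference between the two numbers is the cost of everything loaded before the command body runs, which is the part every other command pays as well.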
> As an example, (figures from memory from that phone call)
> time bzr st bzr.dev -> 360ms
> time bzr st newly-inited-tree -> 320ms
>
> so 320ms to get going, 40ms to do the status itself.
>
> On that machine, 'time python -c "import sys"' -> 20ms
>
> So 300ms/360ms is in loading bzrlib code.
>
> There are lots of contributing factors to this, but the basic problem
> is:
> - we're loading too much code.
>
> The ideal minimum amount of code [while retaining object module and
> behaviour] to run status is roughly:
> - the transport factory & local transport module
> - the repository factory & pack repository check-the-format code
> - the branch factory & branch check-the-format code
> - _walkdirs_utf8
> - dirstate parser
> - the WT4 intertree module for dirstate<->basis tree
> - the cmd_status object
> - the encoding detection logic
> - logging to ~/bzr.log support
> - the bzr front end
>
I am not familiar with much of the bzr code, but I am a bit surprised that
most of the time is spent in bzr code itself. For example, we recently
improved import time for numpy, which is a big piece of code (with a
lot of C too; I think we have many more files and more code than bzr), and
it loaded faster than bzr even before the improvement (we went from 0.2 to
0.1 seconds, again on the same machine). The problem was that some modules
in the stdlib are extremely slow to load (for example, I remember
inspect being really slow: importing inspect takes twice as long as
launching python itself).
Do you have a list of the stdlib modules loaded by bzr (for
example, by 'bzr rocks')? By quickly grepping through the bzr sources, I
put together a python script which does nothing but import the modules I
found, and it already takes 1/3 of the time taken by 'bzr rocks'.
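A minimal sketch of that kind of script follows. The module list here is only a placeholder I picked for illustration, not the actual set of modules bzr imports, and time_imports is a hypothetical helper name. Each import is timed in order, so a dependency shared by several modules is charged to whichever one pulls it in first.

```python
import importlib
import sys
import time

# Placeholder list for illustration only; NOT the real set of stdlib
# modules that bzr imports. Substitute the names found by grepping.
CANDIDATES = ["inspect", "re", "subprocess", "tempfile", "urllib.parse"]


def time_imports(names):
    """Import each module in order; return (name, seconds, already_loaded).

    already_loaded is True when the module was in sys.modules before the
    import, in which case the measured time is close to zero.
    """
    results = []
    for name in names:
        already_loaded = name in sys.modules
        start = time.perf_counter()
        importlib.import_module(name)
        elapsed = time.perf_counter() - start
        results.append((name, elapsed, already_loaded))
    return results


if __name__ == "__main__":
    results = time_imports(CANDIDATES)
    # Slowest imports first.
    for name, elapsed, already_loaded in sorted(results, key=lambda r: -r[1]):
        note = " (already loaded)" if already_loaded else ""
        print("%-15s %7.1f ms%s" % (name, elapsed * 1000.0, note))
    print("total: %.1f ms" % (sum(r[1] for r in results) * 1000.0))
```

On Python 3.7 and later, running the interpreter with -X importtime gives a finer per-module breakdown on stderr without any script, but the sketch above works on any interpreter that has importlib.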
cheers,
David