RFC: startup time - again
adrian.wilkins at gmail.com
Wed Sep 10 12:26:45 BST 2008
Apologies for big post, it just poured out.... you RFCd, you're getting Cs.
The obsession about startup time... is this just a pissing contest with git?
The performance improvements of Bazaar on Win32 recently have been
enormous, despite startup time being far worse there.
I started using Bazaar somewhere around 0.9-ish. Back then Bazaar beat
SVN for similar operations by several integer multiples. On 1.6, it
totally blows it away. I'm still rather envious when I work with the
same branches on Linux and it's so much snappier, but 1.5 --> 1.6
removed a lot of that envy.
For me, the real performance of Bazaar is not in whether it reports on a
`bzr st` before I've lifted my finger from the enter key. It's in all
the work it does for me. It's in the merging support. It's in the fact
that it's written in a language that I could pick up and hack on without
being terrified of what I'd do to it. (My C experience is limited).
Please don't consider going to C for performance reasons. If Bazaar had
been in C, you would not have got patches from me; I contribute to
Bazaar because it supports my project and because I don't want to
maintain my own branch, not purely out of altruism. If Bazaar had been
in C, I might not even have adopted it - regardless of the enabling
features, I need certain things to work that just did not work on
Windows at the start. I knew very little Python before I used Bazaar,
but I knew enough to be confident that I could resolve my current and
future issues with the software in a timely manner.
Without Bazaar, I know my project would have ended up failing. I would
have had to waste my time re-inventing the wheel and writing
merge-tracking support for SVN. Even if I'd leveraged SVK, the sheer
tedium of VCS operations on our production working tree with SVN/SVK
would have increased my iteration and testing time enormously. As it is,
I've reached my first milestone, and the users (who are not technical)
are in general happy with the results, although more GUI sugar would
please them. For reference, their old tools consume about 10 minutes
doing the equivalent of "cvs up" _each time_ they start them.
> When optimising one normally looks at the most expensive part of the
> process - which startup is, for modest code bases (like bzr).
Yes, startup time matters. But I'm much more concerned about the
performance of actual operations. I'm overwhelmingly more interested
in spending time on features that let _people_ work less, not the
computer. Given my experience with VSS, CVS, and SVN, and the size of
the trees I'm working on, I think you could add a 1 second delay loop to
the startup and I'd still be overwhelmingly happy with the performance.
The most expensive part of the process is programmers' time, and Bazaar
scores best by consuming less of it. The largest component of this is
not the latency between hitting enter and doing useful work. It takes
more than 100ms for most people to type a character - you could
justifiably claim that the "st" alias for the status command is a 16x
better optimization than scratching 10% off a 250ms startup overhead
because it saves the user four keystrokes.
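The 16x figure can be checked with a few lines of arithmetic. This is only a back-of-envelope sketch using the numbers quoted above (100ms per keystroke, 250ms startup, a 10% startup improvement); the function names are made up for illustration.

```python
# Back-of-envelope check of the "16x" claim, using only figures from the text.
KEYSTROKE_MS = 100       # assumed time to type one character
STARTUP_MS = 250         # startup overhead quoted for bzr
STARTUP_CUT_PERCENT = 10 # hypothetical startup improvement


def alias_saving_ms(full, alias, keystroke_ms=KEYSTROKE_MS):
    """Typing time saved by using a shorter alias for a command."""
    return (len(full) - len(alias)) * keystroke_ms


def startup_saving_ms(startup_ms=STARTUP_MS, percent=STARTUP_CUT_PERCENT):
    """Time saved per invocation by trimming startup by the given percent."""
    return startup_ms * percent / 100


alias_ms = alias_saving_ms("status", "st")  # 4 keystrokes * 100ms
startup_ms = startup_saving_ms()            # 10% of 250ms
ratio = alias_ms / startup_ms

print(alias_ms, startup_ms, ratio)
```

On these assumptions the alias saves 400ms per invocation against 25ms for the startup trim, which is where the factor of 16 comes from.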
I'd much rather see people spending time on, for example, pluggable
merging support, which would save my users far, far more time than the
interval between their finger leaving the key and Bazaar getting to
work. The majority of topics on this list receive far fewer column
inches than this one and are generally of much greater importance.
---- thoughts about the Windows version
> under Windows it is about 500ms.
AFAIK this is mostly because of the monstrous overheads of process
creation and teardown on win32. I've written *nix style pipe-process
utilities that do little things on win32, only to be flabbergasted as
process overhead accounts for over 50% of runtime; just don't do it,
especially for .NET executables. git has efforts specifically to
"libify" as many of the discrete executables as possible to reduce this
phenomenon.
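For anyone who wants to see the per-process cost on their own machine, here is a rough sketch of how it could be measured. It times the launch of a Python interpreter that does nothing, so the figure includes interpreter startup as well as OS process creation and teardown; the helper name and the run count are arbitrary choices, not anything from bzr.

```python
import subprocess
import sys
import time


def mean_spawn_ms(runs=5):
    """Average wall-clock time (ms) to start and reap a do-nothing child process.

    The child is the current Python interpreter running `pass`, so the result
    covers process creation, interpreter startup, and teardown together.
    """
    cmd = [sys.executable, "-c", "pass"]
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(cmd, check=True)
    elapsed = time.perf_counter() - start
    return elapsed * 1000 / runs


if __name__ == "__main__":
    print("mean spawn time: %.1f ms" % mean_spawn_ms())
```

Running the same script on Linux and on win32 should give a feel for how much of the gap is down to the platform rather than to bzr itself.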
This is one of the reasons that occasionally I think that Bazaar on
IronPython would be really fantastic; you might get the much touted JIT
compiled performance improvements (although I think a lot of that would
be forestalled by the real problem in win32 which is filesystem
performance), but you could also make it a PowerShell snap-in, which
means it would be loaded in process on first use (or before?) and its
startup time could possibly make the Linux version look a little sick. A
more object-y "builtins" would complement it nicely; return status
objects from status calls, etc, and pipe them down the object-pipeline
that makes PoSH a contender for best Windows shell.
Then of course, I download the latest IronPython beta and lament how
trivial it is to get bzrlib to break it.