RFC: startup time - again

Robert Collins robertc at robertcollins.net
Thu Sep 11 02:13:29 BST 2008

On Wed, 2008-09-10 at 12:26 +0100, Adrian Wilkins wrote:
> Apologies for big post, it just poured out.... you RFCd, you're getting Cs.

Thank you - I do appreciate this. I'm sometimes disappointed in my
RFCs, but not in this one - wow, tonnes of comments.

> The obsession about startup time... is this just a pissing contest with git?

I wish. The big thing with git, in terms of problematic pissing
contests, is network speed, and in terms of problematic-for-git contests
is usability; we have no chance on raw startup speed to be like git. Not
without a lot of sacrifices I'm frankly uninterested in making. So no,
it's not about that.

> The performance improvements of Bazaar on Win32 recently have been
> enormous, despite startup time being far worse there.
> I started using Bazaar somewhere around 0.9-ish. Back then Bazaar beat
> SVN for similar operations by several integer multiples. On 1.6, it
> totally blows it away. I'm still rather envious when I work with the
> same branches on Linux and it's so much snappier, but 1.5 --> 1.6
> removed a lot of that envy.

That's fantastic to know.

> For me, the real performance of Bazaar is not in whether it reports on a
> `bzr st` before I've lifted my finger from the enter key. It's in all
> the work it does for me. It's in the merging support. It's in the fact
> that it's written in a language that I could pick up and hack on without
> being terrified of what I'd do to it. (My C experience is limited).

Indeed. And keeping a good balance is important.

It's a fact today that the inner loops of the most performance-critical
parts of bzr are usually executing raw C/pyrex code. Our sequence
matcher is C, the btree index based formats coming online at the moment
use C to parse the index, the performance improvements on Windows that
you speak of... are C.

> > C
> Please don't consider going to C for performance reasons.

I think something has got lost in translation; I am not interested in
ditching Python - I'm interested in making sure that the use of C we
have is better and less ad hoc, un-detail-tested, unprofilable and
frankly annoying. I want to make that work better; and I can see some
possible advantages in allowing it to spread [while still keeping
reference Python-only versions of everything].

> > When optimising one normally looks at the most expensive part of the
> > process - which startup is, for modest code bases (like bzr).
> Yes, startup time matters. But I'm much more concerned about the
> performance of actual operations. I'm overwhelmingly more interested
> in spending time on features that let _people_ work less, not the
> computer. Given my experience with VSS, CVS, and SVN, and the size of
> the trees I'm working on, I think you could add a 1 second delay loop to
> the startup and I'd still be overwhelmingly happy with the performance.

I wouldn't be :).

> The most expensive part of the process is programmers time, and Bazaar
> scores best by consuming less of it. The largest component of this is
> not the latency between hitting enter and doing useful work. It takes
> more than 100ms for most people to type a character - you could
> justifiably claim that the "st" alias for the status command is a 16x
> better optimization than scratching 10% off a 250ms startup overhead
> because it saves the user four keystrokes.

:). Direct console use is only one use of bzr; other users like GUIs,
IDEs and web services also need to use bzr smoothly, and
_requiring_ that they run a client-server model to work well is a bit
ugly. See Nicholas Allen's Eclipse stuff for instance, I believe he's
running bzr in a subshell - adding a second to startup time would
probably make that unfeasible, and taking a half second off will make
the Eclipse UI feel that much snappier.

> I'd much rather see people spending time on, for example, pluggable
> merging support, which would save my users far, far more time than the
> interval between their finger leaving the key and Bazaar getting to
> work. The majority of topics on this list receive far fewer column
> inches than this one and are generally of much greater importance.

Well, right now I'm choosing to work on performance [in general] on
large trees. Startup time is getting in my way because it's adding so
much noise I have to use reallllly large trees to get sensible results.
Which makes the change-execute-evaluate cycle longer. Which is why I
mentioned it to Martin on the phone, and our discussion was interesting
- thus this thread :).
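
To make the noise point concrete, here is a rough sketch of how one
might measure the fixed startup cost separately, so it can be compared
against the time of a whole command. The helper names (time_command,
startup_baseline) are made up for illustration, and it times a bare
Python interpreter rather than bzr itself; it is not part of bzr.

```python
import subprocess
import sys
import time


def time_command(argv, runs=5):
    """Return the best wall-clock time (seconds) over several runs of argv."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output so terminal I/O does not distort the timing.
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def startup_baseline(runs=5):
    """Time an interpreter that starts and exits without doing any work."""
    return time_command([sys.executable, "-c", "pass"], runs)
```

If the baseline is a large fraction of the total time for a command,
small improvements in the command itself get lost in the measurement,
which is the problem described above.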

Also note that startup for me, cold cache, is 16+ seconds. That's *huge*
and it's totally fixable - by making startup for actual useful commands
*do less*.
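
One common way to make startup "do less" is to defer module loading
until a module is actually used. The sketch below uses the standard
library's importlib.util.LazyLoader from modern Python; the lazy_import
helper is a hypothetical name for this example, and this is only an
illustration of the general idea, not a description of what bzr does.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose code only runs on first attribute access."""
    if name in sys.modules:
        # Already loaded (or already lazy); nothing to defer.
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError("no module named %r" % name)
    # Wrap the real loader so execution is postponed.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up the lazy module, does not run it yet
    return module


# The import cost is paid here only if and when colorsys is really used.
colorsys = lazy_import("colorsys")
```

A command that never touches the module never pays for importing it,
which matters most in the cold cache case where each import means disk
reads.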

> ---- thoughts about the Windows version
> > under Windows it is about 500ms.

> Then of course, I download the latest IronPython beta and lament how
> trivial it is to get bzrlib to break it.

Clean patches to improve our compatibility are totally welcome :>.


GPG key available at: <http://www.robertcollins.net/keys.txt>.