Slowdowns

Tue May 23 14:14:56 BST 2006

Martin Pool wrote:
> On 22 May 2006, John Arbash Meinel <john at arbash-meinel.com> wrote:
> 
>> So there is an advantage. For simple commands, it can be a lot. But for
>> something like 'bzr status', we are better off fixing other things.
> 
> My opinion is that measuring the time for no-op commands can be a useful
> technique but the constant time we actually care about measuring is that
> for simple operations in small trees - e.g. "bzr status", "bzr commit"
> etc in branches with zero or one versioned file.  If things like
> cElementTree and configobject are slowing us down but are necessarily
> involved in things like parsing the inventory we need to get to grips
> with them.  Conversely there's no good reason why things like doctest
> should be loaded in normal operation so we should fix that up.
> 

I completely agree that status/commit/add/etc are the important
commands, where we really want to spend our time.

It was really useful for me to start with 'bzr rocks', just to dig in
and find out what is really going on.

I started at rocks, then went to 'root', and then used 'status'.

I'm pretty sure that by the time you get to status, there is little
benefit to something like 'demandload'. You have to load just about
everything to get there.
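For anyone unfamiliar with the idea, a demand-loading proxy can be sketched roughly like this. This is only an illustration, not bzr's actual 'demandload' code; the LazyModule name is made up for the example.

```python
import importlib


class LazyModule:
    """Stand-in that imports the real module on first attribute access.

    Hypothetical helper for illustration only.
    """

    def __init__(self, name):
        # Write to __dict__ directly so normal attribute lookup finds these
        # and __getattr__ is never triggered for them.
        self.__dict__["_name"] = name
        self.__dict__["_module"] = None

    def __getattr__(self, attr):
        # Only reached for names not found on the proxy itself,
        # i.e. members of the wrapped module.
        if self._module is None:
            self.__dict__["_module"] = importlib.import_module(self._name)
        return getattr(self._module, attr)


# Nothing is imported here; the cost is paid on first use.
colorsys = LazyModule("colorsys")
```

The import cost only moves to the first attribute access, which is why it helps no-op commands but not a command that touches nearly every module anyway.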

With any luck, we might see some improvement in 'configobj' startup time.

I would love to get an improvement in 'import elementtree.ElementTree'.
Basically we need to change the one re.compile() line so that it
happens later. We don't need the escape regex all the time, and it costs
us 25ms on startup.

The only painful one that we can't really get rid of is 'copy', because
'copy' imports 'inspect', which imports 'tokenize', and 'tokenize' has
lots of regexes in it (startup time of ~40ms).

I can demandload copy for all of our stuff, but cElementTree also
depends on it. Most of inspect only depends on the 'token' constants.
Only 'getblock()' depends on tokenize.tokenize, which actually needs the
regexes.
And even more unfortunately, the only thing copy uses inspect for is
to call 'getmro', which just does:

    if hasattr(cls, '__mro__'):
        return cls.__mro__
    else:
        recursively search __bases__

which also doesn't need any of the regex stuff.

So if we implemented 'getmro' in copy.py we could cut 40+ms off
startup time. But that is Python standard library stuff, so I don't know
any way of working around it (other than creating a copy.py that gets
gratuitously loaded in place of the standard lib copy.py).
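To show how little is actually needed, here is a rough standalone version of that MRO lookup with no dependency on inspect or tokenize. The names get_mro and _search_bases are just stand-ins for this sketch, and the fallback branch only matters for objects without '__mro__' (classic classes).

```python
def get_mro(cls):
    """Return the method resolution order of cls without importing inspect."""
    if hasattr(cls, "__mro__"):
        return cls.__mro__
    # Fallback for objects that expose only __bases__.
    result = []
    _search_bases(cls, result)
    return tuple(result)


def _search_bases(cls, accum):
    # Depth-first, left-to-right walk, skipping anything already seen.
    if cls in accum:
        return
    accum.append(cls)
    for base in cls.__bases__:
        _search_bases(base, accum)
```

Nothing here touches a regex, so a copy.py built on something like this would not need to import inspect at all.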

John
=:->
