[MERGE] log|less 590% to 727% faster

Sun Jun 18 00:05:25 BST 2006

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Aaron Bentley wrote:
> John Arbash Meinel wrote:
>>> Aaron Bentley wrote:
> 
>>>>> I think I'm happy with 44 milliseconds for the first screenful.  Merges
>>>>> shouldn't affect that very much.  Now if only we can pull down that
>>>>> startup time...
> 
>>> Actually, we could just mess with the import of 'inspect' for 'copy'.
>>>
>>> I just played with something, and I found a way that 'demandload()' can
>>> actually hack-into sys.modules so that other modules actually load a
>>> demandload version.
>>>
>>> This semi-evil hack is available from here:
>>> http://bzr.arbash-meinel.com/misc/demandload_through_sys/
> 
> Your server appears to be down.

No, it just changed IP address, and I have my TTL set to about 1 hour.
At most it should take one day for it to resolve again (stupid DSL, I
wanted to pay for better service, but they don't offer it in my area).

> 
>>> Basically, you can do:
>>>
>>> demandload(globals(), 'inspect foo bar')
>>> sys.modules['inspect'] = inspect
>>> sys.modules['foo'] = foo
>>> sys.modules['bar'] = bar
>>>
>>> And then anyone else who tries to do 'import foo' actually gets the
>>> demandload object until they access an attribute.
> 
> IIRC, the issue is that we want to avoid importing tokenize, because of
> its various and funky regexes.  Which we should be able to accomplish
> using this method, right?

Yep. I specifically did:
demandload(globals(), 'inspect tokenize')
sys.modules['inspect'] = inspect
sys.modules['tokenize'] = tokenize

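The way this demandload works is that when someone grabs an attribute,
it replaces itself in the namespace that demandloaded it. The problem
with my module-hacking version is that a module which picks the proxy up
from sys.modules keeps its own reference to it, so 'copy' will always be
going through an attribute lookup on the proxy to get to 'inspect'.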
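
For anyone who can't reach the branch, here is a rough sketch of the idea
in current Python. It is not the code from my branch: LazyModule and this
demandload() are hypothetical names written for illustration, it only
handles top-level module names, and the __spec__ class attribute is an
assumption about what CPython's import machinery looks up on entries in
sys.modules.

```python
import importlib
import sys


class LazyModule:
    """Hypothetical stand-in that imports the real module on first use."""

    # Assumption: the import machinery may look at __spec__ on whatever it
    # finds in sys.modules; defining it here keeps that lookup from
    # triggering a load.
    __spec__ = None

    def __init__(self, scope, name):
        self._scope = scope
        self._name = name
        self._module = None

    def _load(self):
        if self._module is None:
            # Step out of the way so the real import is not short-circuited.
            if sys.modules.get(self._name) is self:
                del sys.modules[self._name]
            self._module = importlib.import_module(self._name)
            # Replace ourselves in the namespace that demandloaded us.
            if self._scope.get(self._name) is self:
                self._scope[self._name] = self._module
        return self._module

    def __getattr__(self, attr):
        # Only reached when normal lookup fails, i.e. for module attributes.
        return getattr(self._load(), attr)


def demandload(scope, names):
    """Bind a LazyModule in `scope` for each space-separated module name."""
    for name in names.split():
        scope[name] = LazyModule(scope, name)
```

After demandload(globals(), 'inspect tokenize') you would still do the
sys.modules assignments by hand, as above. Any other module that imported
the proxy in the meantime keeps its reference to it, so its accesses go
through __getattr__ every time; the sketch caches the real module so that
at least the import itself only happens once.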

I suppose I could look into frames, such that when you access the
attribute it figures out what the calling frame was, and replaces itself
in that frame. A little bit more evil, but if it works ....
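
Something along these lines, as an untested-in-bzr sketch: the class name
is made up, it relies on the CPython-specific sys._getframe(), and it only
rebinds names in the caller's module globals (function locals and
references held elsewhere are left alone).

```python
import importlib
import sys


class FrameAwareLazyModule:
    """Hypothetical proxy that rebinds itself in the accessing frame."""

    # Assumption: keeps the import machinery's __spec__ lookup from
    # triggering a load when the proxy sits in sys.modules.
    __spec__ = None

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        if self._module is None:
            # Remove the proxy so the real import actually runs.
            if sys.modules.get(self._name) is self:
                del sys.modules[self._name]
            self._module = importlib.import_module(self._name)
        # Frame 0 is this method; frame 1 is whoever touched the attribute.
        caller_globals = sys._getframe(1).f_globals
        for key, value in list(caller_globals.items()):
            if value is self:
                # Swap the proxy for the real module in the caller's module.
                caller_globals[key] = self._module
        return getattr(self._module, attr)
```

With that, the first attribute access from 'copy' would swap the real
'inspect' into copy's own globals, and later accesses would not touch the
proxy at all.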

> 
>>> If you look at my demandload branch of bzr, I played with it, and I now
>>> have the time for 'bzr --no-plugins root' down to 240 ms
>>> These are the overall timings:
>>>
>>>                 rocks   root
>>> original        493     504
>>> demandload      148     281
> 
> Nicely done!
> 
>>> And that is even without '--no-aliases' now that we aren't loading
>>> validate.py. (It still takes 35ms to load configobj, but that is down
>>> from 135ms).
> 
> That's still a significant fraction of startup time, so I guess we
> should look at optimizing it.  Probably best to get the latest version,
> and send our hackery upstream.

Yeah. Without my latest hackery, importing cElementTree cost us 120ms,
some from tokenize, and some from elementtree.ElementTree. ElementTree's
regex is only used when writing (which we probably won't do anymore),
and tokenize is only used by inspect (used by copy) under specific
circumstances.

> 
>>> So now that I've played with it some, we might actually want to
>>> consider adding demandload into the core.
>>> I realize it is a little evil to do my monkeypatching demandload, but it
>>> can shave off as much as 100ms from the startup time. Which can be 1/2
>>> the time we spend.
> 
> You can really feel the difference between .141s (hg --version) and .988
> (bzr rocks).  I'm willing to be evil to get there.  We have test cases,
> after all.
> 
> Aaron

Yeah, I agree. I don't think my demandload branch is good for merging.
But I think we can look into doing it right, now that I have experience
with what is actually costing us time.

The first pass would be to include my custom importer, which lets us run
'bzr --profile-imports foo', and records the amount of time spent
compiling all regexes and importing objects.
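
To give an idea of the approach (this is a sketch in current Python, not
the importer from my branch; profile_imports() is a hypothetical name), you
can wrap __import__ and re.compile and record how long each call takes:

```python
import builtins
import re
import time
from contextlib import contextmanager


@contextmanager
def profile_imports():
    """Record (name, seconds) for imports and regex compiles in the block."""
    records = {"imports": [], "regexes": []}
    real_import = builtins.__import__
    real_compile = re.compile

    def timed_import(name, globals=None, locals=None, fromlist=(), level=0):
        start = time.perf_counter()
        try:
            return real_import(name, globals, locals, fromlist, level)
        finally:
            # Inclusive time: nested imports are counted in the parent too.
            records["imports"].append((name, time.perf_counter() - start))

    def timed_compile(pattern, flags=0):
        start = time.perf_counter()
        try:
            return real_compile(pattern, flags)
        finally:
            records["regexes"].append((pattern, time.perf_counter() - start))

    builtins.__import__ = timed_import
    re.compile = timed_compile
    try:
        yield records
    finally:
        # Always restore the originals, even if the block raised.
        builtins.__import__ = real_import
        re.compile = real_compile
```

Code that did 'from re import compile' before the wrapper was installed
will not be seen, and the times are inclusive of nested imports, so the
numbers are only a guide to where to look.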

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFElIq1JdeBCYSNAAMRAtOmAKDGkSD4g3A0T/TjI1+gP/mMvqrWSQCdGgwK
2DO8IZXNIj2YJafaFfqWMeo=
=4O3c
-----END PGP SIGNATURE-----