[RFC] make 'copy' load quickly

John Arbash Meinel john at arbash-meinel.com
Tue Sep 12 23:45:34 BST 2006

We've known for a while that one of the slowest things to import is
'copy'. This is because it depends on 'inspect', which in turn depends
on 'tokenize'. However, 'copy' only uses 2 functions from 'inspect',
and they are practically trivial.
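
To see what an import drags in, you can diff sys.modules around the
import in a fresh interpreter. This is only a sketch: modules_pulled_in
is a hypothetical helper name, not part of the patch, and the list it
returns depends on the Python version, so 'inspect' and 'tokenize' may
not show up on every interpreter.

```python
import subprocess
import sys

# Script run in the child interpreter: record what is loaded, import the
# requested module, then print every module name that appeared.
_CHILD_SCRIPT = (
    "import sys\n"
    "before = set(sys.modules)\n"
    "__import__(sys.argv[1])\n"
    "print('\\n'.join(sorted(set(sys.modules) - before)))\n"
)


def modules_pulled_in(module_name):
    """Return the sorted names of modules newly loaded by importing module_name.

    A fresh interpreter is used so that modules already loaded in this
    process do not hide anything.
    """
    result = subprocess.run(
        [sys.executable, "-c", _CHILD_SCRIPT, module_name],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()


if __name__ == "__main__":
    for name in modules_pulled_in("copy"):
        print(name)
```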

I put together this patch as part of my lazy_import stuff. On my
machine, this is the difference:

bench_rocks.RocksBenchmark.test_rocks  OK   322ms/  407ms

bench_rocks.RocksBenchmark.test_rocks  OK   201ms/  308ms

lazy_import hacked copy:
bench_rocks.RocksBenchmark.test_rocks  OK   165ms/  254ms

So as near as I can tell, I'm shaving 40ms off of the startup time, just
by avoiding importing the real tokenize.

We can't get rid of importing copy, because everyone and their cousin
imports it. (cElementTree imports it, optparse imports it, and half of
our modules use deepcopy() for something.)

Basically, I'm looking to start adding more stuff like this, where the
'bzr' front-end knows about workarounds to make startup faster. That
way regular users of the 'bzrlib' API won't get hacked versions of
anything, but 'bzr' can, because we know what it needs.
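
One way a front-end could do this is to register a lightweight stand-in
in sys.modules before anything else asks for the module. The following
is a minimal sketch of that general technique, not the actual patch:
install_fast_module, slow_helper and _trivial_helper are all
hypothetical names made up for illustration.

```python
import sys
import types


def install_fast_module(name, **attrs):
    """Register a lightweight stand-in under `name` so later imports find it.

    If a module with that name is already loaded, it is left alone and
    returned unchanged.
    """
    if name in sys.modules:
        return sys.modules[name]
    module = types.ModuleType(name)
    module.__dict__.update(attrs)
    sys.modules[name] = module
    return module


def _trivial_helper(value):
    # Placeholder for the one small function the front-end actually needs.
    return value


# Front-end startup: install the stand-in before anything imports the
# real (hypothetical) module.
install_fast_module("slow_helper", helper=_trivial_helper)

import slow_helper  # resolved from sys.modules, so no file is loaded
```

Because only the front-end script runs the install step, code that
imports the library directly never sees the stand-in.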

(In the specific case of 'copy', I'm using the exact code from inspect,
so the final results should be 100% compatible with the original 'copy'.
But I might start doing some other stuff to the regex module, which
would be riskier to load all the time.)

I also wonder if we should submit something like this to the Python dev
guys, since it seems like a bug to have 'copy' take 40ms to load just
because it needs 2 trivial functions from the 'inspect' module.

