[MERGE] log|less 590% to 727% faster

John Arbash Meinel john at arbash-meinel.com
Sun Jun 18 02:26:00 BST 2006


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Aaron Bentley wrote:
> John Arbash Meinel wrote:
>>>>> IIRC, the issue is that we want to avoid importing tokenize, because of
>>>>> its various and funky regexes.  Which we should be able to accomplish
>>>>> using this method, right?
>>> Yep. I specifically did:
>>> demandload(globals(), 'inspect tokenize')
>>> sys.modules['inspect'] = inspect
>>> sys.modules['tokenize'] = tokenize
>>>
>>> The way this demandload works is that when someone grabs an attribute,
>>> it replaces itself. There is a problem with my module hacking version,
>>> which means that 'copy()' will always be going through an attribute to
>>> get to 'inspect'.
> 
> I thought the only problem with inspect was that it imported tokenize,
> which demandloading would fix.  No?

Right, the problem with copy is that it loads inspect, which loads
tokenize. The only thing copy uses 'inspect' for is to call 'getmro',
which is just:

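def getmro(cls):
    # New-style classes already carry their resolution order.
    if hasattr(cls, "__mro__"):
        return cls.__mro__
    else:
        # Otherwise walk the bases depth-first.
        result = []
        _searchbases(cls, result)
        return tuple(result)

def _searchbases(cls, accum):
    # Skip classes that have already been collected.
    if cls in accum:
        return
    accum.append(cls)
    for base in cls.__bases__:
        _searchbases(base, accum)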


None of which needs tokenize at all. But 'inspect' itself uses tokenize
for a few character strings, and at one point calls
'tokenize.tokenize()', which would use all those regexes.

So yes, forcing copy to demandload inspect, and inspect to demandload
tokenize should help us never load tokenize.
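
For illustration, the "replaces itself on first attribute access"
behaviour could look roughly like this in current Python. It is only a
sketch: DemandModule and demandload_sketch are made-up names, not the
demandload code from the branch, and it only handles top-level module
names.

```python
import importlib


class DemandModule(object):
    """Hypothetical placeholder that imports the real module on first use."""

    def __init__(self, name, scope):
        self._name = name
        self._scope = scope

    def __getattr__(self, attr):
        # Only reached for names not found on the placeholder itself,
        # so _name and _scope never come through here.
        module = importlib.import_module(self._name)
        # Swap the placeholder for the real module in the owning
        # namespace, so later lookups there skip this indirection.
        self._scope[self._name] = module
        return getattr(module, attr)


def demandload_sketch(scope, names):
    """Bind each whitespace-separated module name in scope lazily."""
    for name in names.split():
        scope[name] = DemandModule(name, scope)
```

Only the entry in 'scope' gets replaced. Anything else that grabbed the
placeholder earlier keeps going through __getattr__ on every access.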

My concern is that we not do it in an evil way that will break someone
who wants to use bzrlib and needs to do something with inspect/tokenize.

In a very real sense, copy doesn't need 'inspect'. The function it uses
is extremely simple.
I would even consider doing something like:

import bzrlib.simple_inspect
sys.modules['inspect'] = bzrlib.simple_inspect
import copy
del sys.modules['inspect']

In some ways that is less evil than using demandload.
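
A slightly more careful variant of that swap, as a sketch only: the
bare 'del' would also throw away a real 'inspect' that somebody had
already imported, so this version puts back whatever was there before.
stubbed_module is a hypothetical helper, and the stand-in module built
at the bottom plays the role of the proposed bzrlib.simple_inspect.

```python
import contextlib
import sys
import types


def getmro(cls):
    # The only piece of 'inspect' the stand-in needs to provide.
    return cls.__mro__


@contextlib.contextmanager
def stubbed_module(name, stub):
    """Temporarily register stub in sys.modules under name."""
    missing = object()
    saved = sys.modules.get(name, missing)
    sys.modules[name] = stub
    try:
        yield stub
    finally:
        if saved is missing:
            # Nothing was registered before, so leave nothing behind.
            sys.modules.pop(name, None)
        else:
            # Put back the module that callers had already imported.
            sys.modules[name] = saved


# Stand-in for the proposed simple_inspect module (hypothetical).
simple_inspect = types.ModuleType('simple_inspect')
simple_inspect.getmro = getmro

with stubbed_module('inspect', simple_inspect):
    import copy
```

Any module first imported inside the 'with' block keeps its reference
to the stand-in, while everyone else still gets the real 'inspect'.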

> 
>>> Yeah, I agree. I don't think my demandload branch is good for merging.
>>> But I think we can look into doing it right now that I have experience
>>> with what is actually costing us time.
>>>
>>> The first pass would be to include my custom importer, which lets us run
>>> 'bzr --profile-imports foo', and records the amount of time compiling
>>> all regexes, and importing objects.
> 
> That sounds very useful.
> 
> Aaron

It helped me track stuff like 'copy' down. I'll post it for review again.

The biggest complaint I have is that for it to be truly useful, it needs
to happen as early as possible, so it should really happen in the 'bzr'
front-end script.
I suppose we could actually create a different front-end script which
installs the custom importer, and then just thunks over to bzr.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFElKuoJdeBCYSNAAMRAj0HAJ9WqWnpz3ZGvdm4rqM9KSe9SWJjMwCeIm+C
GIQcld3Obnmn4MGOvOA63/I=
=yaKq
-----END PGP SIGNATURE-----



