[MERGE] import both c-extension and python module with the same name for testing
Alexander Belchenko
bialix at ukr.net
Tue Aug 7 02:08:03 BST 2007
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Robert Collins wrote:
> On Mon, 2007-08-06 at 23:33 +0300, Alexander Belchenko wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Today I had a conversation on IRC with spiv and Peng.
>> spiv and I discussed my pyrex bencode and then started to discuss
>> a new strategy in bzrlib for pyrex/c extension modules.
>>
>> Because we want to test all versions (python and pyrex) separately,
>> John did the following for his _knit_load_data pyrex extension:
>>
>> 1) python version of module named _load_data_py.py
>> 2) pyrex version of module named _load_data_c.pyx
>> 3) Then importing one of them (in knit.py):
>>
>> try:
>>     from bzrlib._knit_load_data_c import _load_data_c as _load_data
>> except ImportError:
>>     from bzrlib._knit_load_data_py import _load_data_py as _load_data
>>
>> If we need this zoo only for testing, then we could provide a more complex
>> import scheme for the test suite and simplify the core code.
>
> Well, there's no need for other modules to import the fastpath code. In
> actual fact I had this basic approach in my very early pyrex
> experiments, but we moved away from it.
I'm not saying that the _load_data code in particular is slow.
But for some modules, keeping two versions under different names might add overhead.
Let's imagine the following situation:
a wrapper foo.py that tries to import either _foo_c.pyx or _foo_py.py.
All other modules in bzrlib import only foo.
Because Python caches imported modules, this code will be executed only once:
try:
    from _foo_c import *
except ImportError:
    from _foo_py import *
So the time for `import foo` will be:
time to import foo.py
+ time to try to import _foo_c
+ (probably) the penalty for the exception
+ (probably) time to import _foo_py
That looks like a very small amount of time. But according to
http://www.jacobian.org/writing/2007/mar/04/hate-python/ (section 2):
"...the way Python’s import mechanism works; importing a package makes around ten different open
syscalls for each entry on sys.path; that is, import foo looks for:
* foo.so
* foomodule.so
* foo.py
* foo.pyc
* foo.pyo
* foo/__init__.so
* foo/__init__module.so
* foo/__init__.py
* foo/__init__.pyc
* foo/__init__.pyo
"
So in the worst case there will be 16 syscalls for 3 different files, plus the penalty for the exception.
I don't know how to measure such small values, especially on win32. Maybe it will cost
a couple of hundred microseconds? I don't know.
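One possible way to get a rough number is sketched below. It is only an illustration under stated assumptions: it targets a modern Python 3 interpreter (not the Python of this thread), the modules foo, _foo_py and the missing _foo_c are throwaway files created in a temporary directory, and the helper name time_fallback_import is made up for this example. It evicts the modules from sys.modules before every import so that the try/except fallback really runs each time.

```python
import importlib
import os
import sys
import tempfile
import time


def time_fallback_import(repeats=200):
    """Return the average seconds per `import foo` when _foo_c is missing."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical modules: a pure-Python implementation and a wrapper.
        with open(os.path.join(tmp, "_foo_py.py"), "w") as f:
            f.write("def answer():\n    return 42\n")
        with open(os.path.join(tmp, "foo.py"), "w") as f:
            f.write(
                "try:\n"
                "    from _foo_c import *\n"
                "except ImportError:\n"
                "    from _foo_py import *\n"
            )
        sys.path.insert(0, tmp)
        try:
            # Make sure the import system notices the freshly written files.
            importlib.invalidate_caches()
            total = 0.0
            for _ in range(repeats):
                # Drop cached modules so the wrapper body executes again.
                for name in ("foo", "_foo_py", "_foo_c"):
                    sys.modules.pop(name, None)
                start = time.perf_counter()
                import foo  # fails on _foo_c, falls back to _foo_py
                total += time.perf_counter() - start
            assert foo.answer() == 42
            return total / repeats
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmp)
            for name in ("foo", "_foo_py"):
                sys.modules.pop(name, None)


if __name__ == "__main__":
    print("average seconds per import: %.6f" % time_fallback_import())
```

The result includes the cost of reading the cached bytecode and of the failed lookup for _foo_c, so it is an upper bound on the try/except overhead rather than a measurement of the exception alone.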
>> Instead of keeping py and c versions with separate module names, we could
>> create them with the same module name (_load_data.py and _load_data.pyx for the example
>> above). At runtime the Python interpreter will import the c-extension first if one
>> is present. And for testing we need a special approach to import the
>> python version. This patch provides a mechanism to achieve this goal.
>>
>> The main reason for this patch is faster import of modules: we let the Python
>> interpreter choose the appropriate version of the module and get rid of the
>> try/except ImportError construct.
>
> I don't believe there will be any difference in performance when the C
> version is present. I'd like to see test results on the performance
> difference when the C version is not present.
I don't know how to measure such small values.
The real gain I see in the proposed approach is simpler code
and fewer unnecessary wrappers.
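To illustrate the testing side of the same-name idea: when a C extension shadows a .py file of the same name, the test suite can still load the Python source explicitly by file path. The sketch below is not the code from the patch; it uses the importlib API of modern Python 3, and the helper load_python_version, the alias _load_data_py_for_tests and the temporary _load_data.py file are all hypothetical names chosen for this example.

```python
import importlib.util
import os
import sys
import tempfile


def load_python_version(module_name, py_path):
    """Load the .py file at py_path as module_name, bypassing sys.path lookup."""
    spec = importlib.util.spec_from_file_location(module_name, py_path)
    module = importlib.util.module_from_spec(spec)
    # Execute the source; the module is deliberately not put in sys.modules,
    # so a normal `import` elsewhere still gets whatever Python would pick.
    spec.loader.exec_module(module)
    return module


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for the pure-Python implementation file.
        path = os.path.join(tmp, "_load_data.py")
        with open(path, "w") as f:
            f.write("IMPLEMENTATION = 'python'\n")
        module = load_python_version("_load_data_py_for_tests", path)
        return module.IMPLEMENTATION, "_load_data_py_for_tests" in sys.modules


if __name__ == "__main__":
    print(demo())
```

With something like this in the test helpers, the core code needs only a plain `import` and no try/except wrapper.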
I feel that my arguments are weak.
- --
[µ]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFGt8XzzYr338mxwCURAsWyAJ9QUlpcYw2X0Izcea4e+39KQEo63ACglLO2
LVYQKQk0VJnxRLxkuXD7bcE=
=CtCy
-----END PGP SIGNATURE-----