[MERGE] pyrex bencode implementation
John Arbash Meinel
john at arbash-meinel.com
Wed Aug 15 20:23:36 BST 2007
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Alexander Belchenko wrote:
> Aaron Bentley ?8H5B:
>> Alexander Belchenko wrote:
>>> Here is the patch for pyrex bencode version.
>>> I create simple benchmark for tags serialization/deserialization
>>> (i.e. indirect benchmark for bencode). This benchmark use
>>> tags dictionary with 100 items.
>>> Here results on my machine (CeleronM 1.7GHz Windows XP)
>>> Pure python bencode:
>>> 906ms bzrlib.benchmarks.bench_tags.TagsBencodeBenchmark.test_deserialize_tags
>>> 656ms bzrlib.benchmarks.bench_tags.TagsBencodeBenchmark.test_serialize_tags
>>> Pyrex version:
>>> 375ms bzrlib.benchmarks.bench_tags.TagsBencodeBenchmark.test_deserialize_tags
>>> 453ms bzrlib.benchmarks.bench_tags.TagsBencodeBenchmark.test_serialize_tags
>>> These numbers are for 1000-iteration loop, so you need to divide time by 1000,
>>> i.e. it's actually us not ms for 1 iteration.
>> BB:abstain
>
>> So that makes it .906ms for the python implementation and .375ms for the
>> pyrex implementation? That doesn't sound worth it to me.
>
>> Remember, increasing performance isn't about just optimizing anything we
>> can. Optimization always has a cost, usually in code clarity and
>> increased maintenance.
>
>> So optimization should start with profiling the code, seeing what parts
>> of what operation are slow, and then deciding the correct way to improve
>> performance.
>
>> For all I know, your performance win comes simply from avoiding function
>> call overhead, and that could be fixed without Pyrex.
>
> I think it's not quite true. In decode I also avoid numerous memory
> allocations for each intermediate value.
>
>> HACKING says a patch should "Improves bugs, features, speed, or code
>> simplicity"
>
> (Or scratch the itch of patch's author?)
>
>> This patch reduces code simplicity, and the speed increases don't seem
>> to be substantial.
>
> I think you're right. But in this case your vote is incorrect.
> I expecting bb:reject.
>
Well, reject is indicating that he doesn't want it enough to ask that
nobody else merge it.
His actual statement is more of a "I don't think this is worthwhile." In
the old voting it would be -0 or -0.5 rather than -1 (reject).
Certainly I would avoid doing a reject until enough people had discussed
that we actually came to a bit of a consensus.
My general feeling is that I would really like to keep this code around,
in case we start using bencode a lot more (it seems like we will for
packs, etc). But that it probably isn't worth merging yet.
So if you can put it on Launchpad, so that we can always have access to
it if we find that bencode starts showing up on our profiling, then we
could certainly bring it in.
I haven't looked closely at the code though. So there may be some
obvious places that we could tweak to make it faster.
One thing I found very useful was to look at the generated C code. It is
a bit crufty, but once you get used to it, you can start spotting things
you didn't expect.
Like all of a sudden it is treating your Pyrex object as a plain Python
Object because you forgot to declare the type. Or you find it is using
getattr() and building up tuples to place function calls, because you
have the other function as a "def foo" rather than a "cdef foo". (I will
frequently factor out the real function into a cdef _func, and then
write a def func wrapper to call it from python, but have the C function
available for when I'm already in C code).
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFGw1K4JdeBCYSNAAMRAk1HAKDLPUI5vDedPDpdzQ6aMsOdMB7QvwCfbhwG
e0ls0v1esM9l0hztn2NxOBE=
=LGdY
-----END PGP SIGNATURE-----
More information about the bazaar
mailing list