[MERGE] pyrex bencode implementation
Alexander Belchenko
bialix at ukr.net
Wed Aug 15 20:01:18 BST 2007
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Aaron Bentley wrote:
> Alexander Belchenko wrote:
>> Here is the patch for pyrex bencode version.
>
>> I created a simple benchmark for tags serialization/deserialization
>> (i.e. an indirect benchmark for bencode). This benchmark uses
>> a tags dictionary with 100 items.
>
>> Here are the results on my machine (Celeron M 1.7GHz, Windows XP):
>
>> Pure python bencode:
>
>> 906ms bzrlib.benchmarks.bench_tags.TagsBencodeBenchmark.test_deserialize_tags
>> 656ms bzrlib.benchmarks.bench_tags.TagsBencodeBenchmark.test_serialize_tags
>
>> Pyrex version:
>
>> 375ms bzrlib.benchmarks.bench_tags.TagsBencodeBenchmark.test_deserialize_tags
>> 453ms bzrlib.benchmarks.bench_tags.TagsBencodeBenchmark.test_serialize_tags
>
>> These numbers are for a 1000-iteration loop, so you need to divide the time
>> by 1000, i.e. the figures are actually µs, not ms, for 1 iteration.
>
> BB:abstain
>
> So that makes it .906ms for the python implementation and .375ms for the
> pyrex implementation? That doesn't sound worth it to me.
>
> Remember, increasing performance isn't about just optimizing anything we
> can. Optimization always has a cost, usually in code clarity and
> increased maintenance.
>
> So optimization should start with profiling the code, seeing what parts
> of what operation are slow, and then deciding the correct way to improve
> performance.
>
> For all I know, your performance win comes simply from avoiding function
> call overhead, and that could be fixed without Pyrex.
I think that's not quite true. In decode I also avoid the numerous memory
allocations for each intermediate value.
>
> HACKING says a patch should "Improves bugs, features, speed, or code
> simplicity"
(Or scratch the itch of the patch's author?)
> This patch reduces code simplicity, and the speed increases don't seem
> to be substantial.
I think you're right. But in that case your vote is incorrect:
I was expecting bb:reject.
May I cite my previous posts about this implementation (5 and 7 August)?
"I started to dive into Pyrex and decided to write something simple and useful
for the Bazaar project. I chose bencode because it's a simple algorithm,
and also because Martin said in the past that we could use bencode for
VersionedProperties.
I used the benchmarks from the BitTorrent-bencode-5.0.8 package. With generic
benchmark data I get about 5x-6x faster decode and more than 2.5x faster encode:
version    decode     encode
python     30.78ms    18.75ms
pyrex       5.31ms     7.03ms
For VersionedProperties I think I need to rework the benchmark to use dicts with
several (2-3) keys. For shorter bencode strings, decoding in Python will probably
be fast enough, so the speed difference will be smaller.
But I think even 2-3x is better than nothing.
Of course for tags this difference will be too small, but for 10K-50K files
with VersionedProperties attached to inventory entries we should have a big win."
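
To make the arithmetic behind those figures explicit, here is a small sketch
that recomputes the speedup ratios and the per-iteration cost from the numbers
quoted in this thread. The helper names (speedup, per_iteration_us) are made up
for illustration; only the timings themselves come from the measurements above.

```python
# Timings in milliseconds as (pure python, pyrex), copied from this thread.
TAGS_LOOPS = 1000  # the tags benchmark runs a 1000-iteration loop
TAGS_MS = {
    "deserialize": (906.0, 375.0),
    "serialize": (656.0, 453.0),
}
GENERIC_MS = {
    "decode": (30.78, 5.31),
    "encode": (18.75, 7.03),
}


def speedup(python_ms, pyrex_ms):
    """How many times faster the pyrex timing is than the python one."""
    return python_ms / pyrex_ms


def per_iteration_us(total_ms, loops):
    """Convert a total loop time in ms to microseconds per iteration."""
    return total_ms / loops * 1000.0


if __name__ == "__main__":
    for name, (py_ms, px_ms) in TAGS_MS.items():
        print("tags %-12s %.0f us -> %.0f us per call, %.2fx" % (
            name,
            per_iteration_us(py_ms, TAGS_LOOPS),
            per_iteration_us(px_ms, TAGS_LOOPS),
            speedup(py_ms, px_ms)))
    for name, (py_ms, px_ms) in GENERIC_MS.items():
        print("generic %-9s %.2fx" % (name, speedup(py_ms, px_ms)))
```

With these inputs the generic data gives roughly 5.8x for decode and 2.7x for
encode, while the tags benchmark gives roughly 2.4x and 1.4x, which is why the
tags case alone looks unconvincing.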
I have nothing to add here, except that maybe my benchmark is awfully bad.
I'm not planning to speed up tags; they are already fast enough.
I also asked what kind of data other parts of bzrlib serialize with bencode,
but nobody answered me:
"As far as I can see from grep output, bencode is currently used
in tag.py, multiparent.py and bundle/serializer/v4.py.
I know how it is used in tag.py.
Can someone give me a quick idea of what kind of data is bencoded in
multiparent and the v4 serializer? What is the typical amount of data,
and what types are they?
I want to write a benchmark reflecting the current usage of bencode
in bzrlib."
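
For anyone who wants to experiment, a benchmark of this kind could be sketched
roughly as below. This is a minimal pure-Python bencode written for
illustration under Python 3, not the bzrlib or Pyrex code from the patch. The
names bencode, bdecode and per_call_us, and the shape of the sample tags dict
(100 tag names mapped to revision ids), are assumptions. Replace the sample
data with whatever multiparent or the v4 serializer really encode.

```python
import timeit


def bencode(value):
    """Encode ints, bytes, lists and dicts (with bytes keys) as bencode."""
    if isinstance(value, bool):
        raise TypeError("bool is not a bencode type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are emitted in sorted order.
        pairs = sorted(value.items())
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in pairs) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def _decode_at(data, pos):
    """Return (value, next_pos) for the item that starts at data[pos]."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = _decode_at(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = _decode_at(data, pos)
            val, pos = _decode_at(data, pos)
            result[key] = val
        return result, pos + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        start = colon + 1
        end = start + int(data[pos:colon])
        return data[start:end], end
    raise ValueError("bad bencode data at offset %d" % pos)


def bdecode(data):
    """Decode one complete bencoded value from a bytes string."""
    value, pos = _decode_at(data, 0)
    if pos != len(data):
        raise ValueError("trailing data after bencoded value")
    return value


def per_call_us(func, arg, loops=1000):
    """Average time of func(arg) in microseconds over `loops` calls."""
    return timeit.timeit(lambda: func(arg), number=loops) / loops * 1e6


# Hypothetical sample data: 100 tags, each mapping a name to a revision id.
SAMPLE_TAGS = {b"tag-%03d" % i: b"rev-id-%03d" % i for i in range(100)}

if __name__ == "__main__":
    encoded = bencode(SAMPLE_TAGS)
    print("encode: %.1f us per call" % per_call_us(bencode, SAMPLE_TAGS))
    print("decode: %.1f us per call" % per_call_us(bdecode, encoded))
```

Dividing the loop total by the loop count inside per_call_us avoids the
ms-versus-µs confusion from the earlier numbers.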
/me is halfway to vacation.
- --
[µ]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFGw01+zYr338mxwCURAhRyAJ0TndXUmZkonZCTLcvmm5F8PLCBQACdF66N
kZi6GB+WQOHWK1KmQ0wRn/8=
=AwQb
-----END PGP SIGNATURE-----