[MERGE] rand_chars() optimization
John Arbash Meinel
john at arbash-meinel.com
Sun Mar 11 15:09:15 GMT 2007
Dmitry Vasiliev wrote:
> John Arbash Meinel wrote:
>> John Arbash Meinel has voted +1 (conditional).
>> Status is now: Semi-approved
>> Comment:
>> Well, if we are going to optimize this one, I have a couple other hints.
>>
>> First, I like the dictionary.
>>
>> But appending to a string is expensive, because it has to generate a
>> new string each time. I'm not sure how this affects rand_chars(16)
>> which is our common case, but it should be a big deal for rand_chars(50).
>>
>> And then, if you are using a dictionary, you should make it a local
>> variable before the loop.
>>
>> So this is what I would do:
>>
>> def rand_chars(num):
>>     _map = _rand_chars_map
>>     return ''.join([_map[b] for b in rand_bytes(num)])
>>
>> Using a list comprehension and then joining with string.join should be
>> faster.
>>
>> I tested a list comprehension versus a generator, and 'timeit' says
>> that the list is faster. I assume this is because the string can
>> iterate once to figure out how long it needs to be, and then again to
>> merge everything together (13us per loop versus 10us per loop).
>>
>> So I'm happy with your version, but there are a couple more tricks we
>> can apply.
>>
>> Can you benchmark it with the alternative, to see how much it helps?
>
> Actually I've already tested list/generator comprehension versions, and
> surprisingly the version that appends to a string was faster. But I
> didn't test a version with the mapping as a local variable.
>
> After adding the local variable, the test results are:
>
> - Generator comprehension: 6577ms/6921ms
> - List comprehension: 5608ms/5921ms
> - New version with the loop: 5405ms/5750ms
>
> I've attached the new version with a less memory-aggressive benchmark and
> the mapping as a local variable.
Interesting results. Can you try the benchmarks with rand_chars(16)? It
seems more appropriate since that is the way we call rand_chars.
Also, we should shoot for benchmarks that take about 1s to finish. Much
shorter than that and the noise is too much. But we also don't want to
spend forever in the benchmark suite.
Can you tune the number of loops to shoot for around 1s?
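To make that concrete, a benchmark of roughly this shape would do (just a
sketch of what I mean, not the attached benchmark itself):

    import time
    from bzrlib import osutils

    loops = 30000   # tune this so the elapsed time lands near 1s
    start = time.time()
    for i in xrange(loops):
        osutils.rand_chars(16)
    elapsed = time.time() - start
    print '%d loops in %.0fms' % (loops, elapsed * 1000.0)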
Oh, and we should add a small comment to let us know that it was
actually benchmarked with different methods, and this was the fastest.
That helps keep people from wondering if it should be done a different way.
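Something like this sketch shows the kind of comment I mean (the
_rand_chars_map and rand_bytes names follow the earlier mail, so don't
take the details as the exact patch):

    def rand_chars(num):
        """Return a string of num random characters."""
        # Note: benchmarked against ''.join() over a list comprehension
        # and over a generator expression; a plain loop appending to a
        # string, with the mapping held in a local variable, was as fast
        # or faster, so the simple loop stays.
        s = ''
        _map = _rand_chars_map
        for raw_byte in rand_bytes(num):
            s += _map[raw_byte]
        return s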
On my machine, it is:

  original   8421ms
  for loop   7200ms
  list comp  6916ms

Now, these are with rand_chars(16).
For me, dropping the number of loops to 30,000 put the test suite about
where I would want it, time-wise.
And I tried to get genuine comparisons between the list comprehension and
the for loop, but basically they were well within the noise margins.
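A quick way to get a feel for that noise is timeit's repeat(); again
only a sketch:

    import timeit

    t = timeit.Timer('osutils.rand_chars(16)',
                     'from bzrlib import osutils')
    # Several repeats make it easier to tell a real difference from
    # noise: if two variants' best times are closer together than the
    # spread within a single variant, the comparison isn't conclusive.
    samples = t.repeat(repeat=5, number=30000)
    print 'best %.0fms, worst %.0fms' % (
        min(samples) * 1000.0, max(samples) * 1000.0)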
I'm generally positive about this change. Though I'm curious whether you
were seeing rand_chars() show up as measurable overhead. It isn't something that
we call a lot. (We call it once per invocation, regardless of the number
of files, and then another time for revision ids).
I'm happy to have fast functions, just curious what you saw here.
John
=:->
PS> If you change it to run in approx 1s, and put some comments as to
why we are using a for loop, +1 from me.