0.6 release plan
Nicholas Nethercote
njn at cs.utexas.edu
Thu Oct 27 16:27:18 BST 2005
On Wed, 26 Oct 2005, John A Meinel wrote:
> Well, I have to say that you are correct that Python's gzip isn't as
> fast as the native gzip. Specifically, I tested this:
>
> import time, os
> import gzip
>
> def t():
>     tstart = time.time()
>     f = gzip.open('inventory.weave.gz')
>     f.read()    # decompress the whole file and discard it
>     f.close()
>     tend = time.time()
>     print tend - tstart
>
> t() # 0.15
>
> So gzip.open() plus reading everything takes 0.15s (this is on my slow
> machine).
>
> On the other hand, we have this:
> $ time zcat inventory.weave.gz > /dev/null
>
> real 0m0.101s
> user 0m0.088s
> sys 0m0.028s
>
> So Python's gzip is 50% slower than the compiled version. And if I
> read in 1024*1024-byte chunks instead, it slows down to 0.17s.
>
> On my fast machine (running Windows), this changes a little bit:
> $ time zcat inventory.weave.gz > /dev/null
>
> real 0m0.069s
> user 0m0.077s
> sys 0m0.030s
>
> t() # 0.036
>
> Which actually means that zcat is slower than Python there. On the other
> hand, "time : | cat" takes 0.07s, so the difference is probably just the
> Windows process-spawn time. ("time /usr/bin/echo" takes 0.04s.)
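(For reference, the chunked variant he mentions above would presumably look
something like this; a rough sketch, using the 1024*1024-byte chunk size from
his message:

    import gzip
    import time

    def t_chunked(chunk_size=1024 * 1024):
        # Same timing as t() above, but reading fixed-size chunks in a
        # loop rather than one big read().
        start = time.time()
        f = gzip.open('inventory.weave.gz')
        while f.read(chunk_size):
            pass
        f.close()
        print(time.time() - start)
)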
I wouldn't trust these numbers, for two reasons. First, they are much too
small; less than 0.1s is at the level of noise. Second, you're comparing
apples with oranges -- what are the relative overheads of starting zcat
vs. Python doing its additional stuff? How much of the time in each case
is spent actually decompressing?
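One way to answer that last question would be to read the compressed bytes
into memory once and then time only the decompression, repeated enough times
to get well above the noise floor. A rough sketch (the repeat count is
arbitrary):

    import time
    import zlib

    # Read the compressed bytes once, so open() and disk I/O are not
    # part of the measurement.
    data = open('inventory.weave.gz', 'rb').read()

    # wbits = 16 + MAX_WBITS tells zlib to expect the gzip wrapper
    # around the deflate stream, i.e. the same format zcat handles.
    n = 100
    start = time.time()
    for _ in range(n):
        zlib.decompress(data, 16 + zlib.MAX_WBITS)
    elapsed = time.time() - start
    print(elapsed / n)   # average seconds per decompression

Comparing that against the same loop wrapped around gzip.GzipFile would show
how much the Python-level file object adds on top of zlib itself.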
Nick