[MERGE] simple performance improvement

Robert Collins robertc at robertcollins.net
Wed Jan 31 17:54:16 GMT 2007


On Wed, 2007-01-31 at 11:49 -0600, John Arbash Meinel wrote:
> It seems that the construct:
> 
> for segment in contents:
>   f.write(segment)
> 
> is more expensive than just calling
> f.writelines(contents)
>
> The --lsprof difference is:
>     484 3.080 2.255 bzrlib.transform:253(create_file)
> +147651 0.511 0.511 +<method 'write' of 'file' objects>
>    +484 0.276 0.276 +<method 'close' of 'file' objects>
> 
> versus
> 
>     484 2.486 2.327 bzrlib.transform:253(create_file)
>    +484 0.078 0.078 +<method 'close' of 'file' objects>
>    +484 0.042 0.042 +<method 'writelines' of 'file' objects>
> 
> In a real-world test of 'bzr checkout --lightweight bzr.dev test' it
> changes the CPU time from 4.29s (+-0.05) down to 4.21s (+-0.07).
> 
> This isn't revolutionary, but it is an improvement, and the change to
> the code is minor. So I'd like to get it merged.

+1.
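
For anyone following along, here is a minimal sketch of the two variants
side by side. It is not the actual bzrlib.transform code: the function
names create_file_loop, create_file_writelines and read_back are made up
for illustration, and contents is assumed to be an iterable of byte
strings.

```python
import os
import tempfile


def create_file_loop(path, contents):
    """Write each segment with its own write() call (the old construct)."""
    f = open(path, 'wb')
    try:
        for segment in contents:
            f.write(segment)
    finally:
        f.close()


def create_file_writelines(path, contents):
    """Hand the whole iterable to writelines() in a single call."""
    f = open(path, 'wb')
    try:
        # writelines() adds no separators, so the bytes written are
        # the same as in the loop above.
        f.writelines(contents)
    finally:
        f.close()


def read_back(path):
    """Return the full contents of path as bytes."""
    f = open(path, 'rb')
    try:
        return f.read()
    finally:
        f.close()
```

Both helpers should leave identical files on disk; the only difference is
one method call per file rather than one per segment.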

> I also think the improvement will scale with larger datasets. And it
> helps keep 'write()' from looking like a performance issue.
> 
> Counterintuitively, I have evidence that changing the line:
> 
> f = open(name, 'wb')
> 
> to
> 
> def do_open():
>   return open(name, 'wb')
> f = do_open()
> 
> actually shaves off another 20ms (down to 4.19). This is averaged over
> 15 runs, so while it isn't a huge dataset, it isn't as if I just ran it
> a couple of times.
> 
> I don't really know how to respond to that. I only did it because
> --lsprof doesn't show the time spent in open(), but it does show the
> time spent in a nested function like do_open().

It may well be because do_open is local, so it is bound in the local
variables, whereas open is a builtin, so it has to be looked up in the
globals and then the builtins. If we do a /lot/ of opens that could
matter.
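
A rough way to check which kind of lookup each name gets is to inspect
the bytecode. The sketch below assumes Python 3.4 or later (for
dis.get_instructions); the helpers global_loads, nested_code,
open_directly and open_via_local are hypothetical names for illustration.
It only shows how names are resolved and measures no time.

```python
import dis
import types


def global_loads(code):
    """Return the names that the bytecode fetches with LOAD_GLOBAL."""
    return set(ins.argval for ins in dis.get_instructions(code)
               if ins.opname == 'LOAD_GLOBAL')


def nested_code(func):
    """Return the code object of the first function defined inside func."""
    for const in func.__code__.co_consts:
        if isinstance(const, types.CodeType):
            return const
    raise ValueError('no nested function found')


def open_directly(name):
    # 'open' is not a local here, so it goes through the global and
    # builtin lookup path.
    return open(name, 'wb')


def open_via_local(name):
    def do_open():
        return open(name, 'wb')
    # 'do_open' is a local variable of this function, so calling it
    # needs no global lookup at this level.
    return do_open()
```

With these definitions, global_loads(open_directly) contains 'open' and
global_loads(open_via_local) is empty. Note that
global_loads(nested_code(open_via_local)) still contains 'open', so the
body of do_open pays for the same lookup; the local binding alone may not
account for the whole 20ms.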

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.