[MERGE] simple performance improvement
John Arbash Meinel
john at arbash-meinel.com
Wed Jan 31 17:49:47 GMT 2007
It seems that the construct:
    for segment in contents:
        f.write(segment)
is more expensive than just calling
    f.writelines(contents)
The --lsprof difference is:
484 3.080 2.255 bzrlib.transform:253(create_file)
+147651 0.511 0.511 +<method 'write' of 'file' objects>
+484 0.276 0.276 +<method 'close' of 'file' objects>
versus
484 2.486 2.327 bzrlib.transform:253(create_file)
+484 0.078 0.078 +<method 'close' of 'file' objects>
+484 0.042 0.042 +<method 'writelines' of 'file' objects>
In a real-world test of 'bzr checkout --lightweight bzr.dev test' it
changes the CPU time from 4.29s (+-0.05) down to 4.21s (+-0.07).
This isn't revolutionary, but it is an improvement, and the change to
the code is minor. So I'd like to get it merged.
I also think the improvement will scale with larger datasets. And it
helps keep 'write()' from looking like a performance issue.
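
For anyone who wants to reproduce the comparison outside of bzr, here is a
minimal sketch. It is not the code from the patch: the helper names
(make_contents, write_with_loop, write_with_writelines, time_both) are made
up for illustration, and the default of about 300 segments per file is only
an assumption derived from the profile above (147651 writes over 484 files).
It is written to run on a current Python, and absolute timings will differ
from the numbers in this mail.

```python
import os
import shutil
import tempfile
import timeit


def make_contents(segments=300):
    # Hypothetical stand-in for the list of byte strings passed to create_file.
    return [('line %d\n' % i).encode('ascii') for i in range(segments)]


def write_with_loop(path, contents):
    # One Python-level method call per segment.
    with open(path, 'wb') as f:
        for segment in contents:
            f.write(segment)


def write_with_writelines(path, contents):
    # A single call; the iteration over segments happens inside the file object.
    with open(path, 'wb') as f:
        f.writelines(contents)


def time_both(number=200, segments=300):
    # Returns (loop_seconds, writelines_seconds) for `number` file writes each.
    contents = make_contents(segments)
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, 'out')
    try:
        loop = timeit.timeit(
            lambda: write_with_loop(path, contents), number=number)
        lines = timeit.timeit(
            lambda: write_with_writelines(path, contents), number=number)
    finally:
        shutil.rmtree(tmpdir)
    return loop, lines


if __name__ == '__main__':
    loop, lines = time_both()
    print('write() loop: %.4fs' % loop)
    print('writelines(): %.4fs' % lines)
```

Both helpers produce identical files, so any difference in the timings comes
from the per-call overhead rather than from the data written.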
Counterintuitively, I have evidence that changing the line:
    f = open(name, 'wb')
to
    def do_open():
        return open(name, 'wb')
    f = do_open()
actually shaves off another 20ms (down to 4.19s). This is averaged over
15 runs, so while it isn't a huge dataset, it isn't like I just ran it a
couple of times.
I don't really know how to respond to that. I only did it because
--lsprof doesn't show the time spent in open(), but it does show the
time spent in a nested function like do_open().
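
The wrapper trick can be sketched with the standard library profiler. This
uses cProfile as a stand-in, which is an assumption on my part: it is not
--lsprof, and it may report built-in calls differently than lsprof does. The
names create_file and profile_create_file here are illustrative and are not
the bzrlib functions. The only point is that a call wrapped in a named Python
function gets its own row in the report.

```python
import cProfile
import io
import os
import pstats
import shutil
import tempfile


def do_open(name):
    # Thin wrapper so the time spent opening the file is attributed
    # to a named Python function in the profile.
    return open(name, 'wb')


def create_file(name, contents):
    # Illustrative stand-in, not bzrlib.transform.create_file.
    f = do_open(name)
    try:
        f.writelines(contents)
    finally:
        f.close()


def profile_create_file(count=50):
    # Profile `count` file creations and return the report as text.
    tmpdir = tempfile.mkdtemp()
    contents = [b'a line of text\n'] * 300
    profiler = cProfile.Profile()
    try:
        profiler.enable()
        for i in range(count):
            create_file(os.path.join(tmpdir, 'file%d' % i), contents)
        profiler.disable()
    finally:
        shutil.rmtree(tmpdir)
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats('cumulative').print_stats()
    return out.getvalue()


if __name__ == '__main__':
    print(profile_create_file())
```

In the printed report, do_open shows up as a separate line under
create_file, which makes the cost of opening files visible on its own.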
John
=:->
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: use_writelines.patch
Url: https://lists.ubuntu.com/archives/bazaar/attachments/20070131/3f3ee08c/attachment-0001.diff