compiled write_inventory 3x faster
John Arbash Meinel
john at arbash-meinel.com
Wed May 31 06:15:14 BST 2006
I decided to continue playing around with my compiled bzr extensions. We
may not have to give up on XML as a file format just yet.
It turns out that elementtree.ElementTree._write is quite slow.
It also happens that cElementTree doesn't override the _write
function, only the read side.
This is what I found while testing. First, stock bzr.dev (--no-plugins)
.test_one_add_kernel_like_tree                 OK  9081ms/17161ms
10825  10824  2116.8260  1016.3810  elementtree.ElementTree:662(_write)
    1      0  3243.2560     0.0160  +bzrlib.xml_serializer:42(write_inventory)
So that shows that the serializer is spending something like 3.2s just
writing out the inventory, and ElementTree._write is responsible for a
large portion of that.
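A measurement of this kind can be reproduced in miniature with the
standard library. The sketch below is an illustration only: it uses
cProfile and xml.etree.ElementTree rather than bzr's --lsprof-timed and
the old elementtree package, and build_tree / profile_write and the
synthetic "inventory" shape are hypothetical names, not bzr code.

```python
import cProfile
import io
import pstats
import xml.etree.ElementTree as ET


def build_tree(n_entries):
    """Build a synthetic inventory-like tree (assumed shape, not bzr's format)."""
    root = ET.Element("inventory")
    for i in range(n_entries):
        ET.SubElement(root, "file", file_id="id-%d" % i, name="file%d.c" % i)
    return ET.ElementTree(root)


def profile_write(n_entries=1000):
    """Profile ElementTree.write() and return (xml_bytes, report_text)."""
    tree = build_tree(n_entries)
    out = io.BytesIO()
    profiler = cProfile.Profile()
    profiler.enable()
    tree.write(out, encoding="utf-8")
    profiler.disable()
    # Render the ten most expensive calls by cumulative time into a string.
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats(10)
    return out.getvalue(), report.getvalue()


if __name__ == "__main__":
    _, text = profile_write()
    print(text)
```

The report lists call counts plus total and per-function time, which is
the same kind of information as the lsprof lines quoted above.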
So the first thing I wrote was just a custom write function, which
takes an Element object and writes it to a file. I still used
pack_inventory and unpack_inventory; I just customized the actual XML
serialization. That gave me this performance:
.test_one_add_kernel_like_tree                 OK  8277ms/16244ms
 1  0   791.1660  791.1660  bzrlib.plugins.rio_inventory:28(_write_element)
+1  0  1898.7990    0.0210  +bzrlib.xml_serializer:42(write_inventory)
(Don't be fooled by rio_inventory; this is testing XML writing speed.)
Well, it looks like the write step dropped from about 2.1s to about
800ms, which is pretty good.
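For readers who want a picture of what "a custom write function that
takes an Element and writes it to a file" can look like, here is a
minimal sketch. It is not the _write_element from the plugin (that code
is not shown in this mail); write_element is a hypothetical name, and
the sketch assumes plain elements with string attributes and no
namespaces, comments or processing instructions.

```python
import io
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def write_element(elem, out):
    """Recursively write `elem` as XML to the text stream `out`."""
    out.write("<" + elem.tag)
    for key, value in elem.attrib.items():
        # quoteattr escapes the value and adds the surrounding quotes.
        out.write(" %s=%s" % (key, quoteattr(value)))
    children = list(elem)
    if not children and not elem.text:
        out.write(" />")
    else:
        out.write(">")
        if elem.text:
            out.write(escape(elem.text))
        for child in children:
            write_element(child, out)
        out.write("</%s>" % elem.tag)
    if elem.tail:
        out.write(escape(elem.tail))


if __name__ == "__main__":
    root = ET.Element("inventory")
    ET.SubElement(root, "file", file_id="id-1", name="a&b.c")
    buf = io.StringIO()
    write_element(root, buf)
    print(buf.getvalue())
```

Skipping the generic machinery (namespace handling, encoding checks on
every string) is the sort of thing that would plausibly account for the
saving, though the sketch itself has not been benchmarked.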
Then I realized we have a whole Python function which builds up a
custom object, just to write it to a file. So I wrote my own C++
_write_inventory function, so now there is no intermediate step. With
that change, I get:
.test_one_add_kernel_like_tree                 OK  7657ms/16251ms
1  0  1111.6240  414.5960  bzrlib.plugins.bzr_extensions:23(write_inventory)
Now, lsprof seems a little confused, saying that 400ms is spent inside
the function, and 1s is spent in total, but there is only one line,
which calls the C++ code.
Regardless, the total time spent in write_inventory is now down from
3.2s to 1.1s.
And the time for just 'bzr add'ing the kernel-like tree has dropped 2.5s.
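The "no intermediate step" idea can be shown in pure Python as well,
although the version measured above is C++. In this sketch the entries
are written straight to the stream without building Element objects
first. write_inventory_direct and the (kind, attrs) entry shape are
assumptions made for illustration; they are not bzr's real inventory
API or file format.

```python
import io
from xml.sax.saxutils import quoteattr


def write_inventory_direct(entries, out):
    """Write (kind, attrs) pairs as XML without an intermediate tree.

    `entries` is an iterable of (tag_name, dict_of_string_attributes).
    """
    out.write("<inventory>\n")
    for kind, attrs in entries:
        parts = ["<", kind]
        # Sort the keys so the output is stable from run to run.
        for key in sorted(attrs):
            parts.append(" %s=%s" % (key, quoteattr(attrs[key])))
        parts.append(" />\n")
        out.write("".join(parts))
    out.write("</inventory>\n")


if __name__ == "__main__":
    buf = io.StringIO()
    write_inventory_direct([("file", {"file_id": "id-1", "name": "a.c"})], buf)
    print(buf.getvalue())
```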
At this point, the #1 most expensive thing is still the 'is_ignored'
call (clocking in at a total of 1.2s spent there).
These are the times without --lsprof-timed:
.test_one_add_kernel_like_tree OK 4788ms/13088ms
vs
.test_one_add_kernel_like_tree OK 3703ms/11701ms
There is a small problem: --lsprof-timed has a larger impact on
Python code than on C++ code, so performance improvements have to be
measured without --lsprof to really verify them. But it still looks like
I was able to shave off almost 25% of the 'bzr add' time (4788ms down to
3703ms).
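The saving can be checked from the two timings taken without
--lsprof-timed. A small sketch of the arithmetic (percent_saved is a
hypothetical helper name):

```python
def percent_saved(before_ms, after_ms):
    """Return the reduction from before_ms to after_ms as a percentage."""
    return 100.0 * (before_ms - after_ms) / before_ms


if __name__ == "__main__":
    # First figure of each test line: stock bzr.dev vs. the C++ writer.
    print("%.1f%%" % percent_saved(4788, 3703))
```

For these two numbers the result is a little under 23%.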
It should also be quite possible to write a custom Element writer which
doesn't have quite as much overhead as ElementTree.write().
I'll look into that next.
John
=:->