compiled write_inventory 3x faster

John Arbash Meinel john at arbash-meinel.com
Wed May 31 06:15:14 BST 2006


I decided to continue playing around with my compiled bzr extensions. We
may not have to give up on XML as a file format just yet.
It turns out that elementtree.ElementTree._write is quite slow, and
cElementTree doesn't override the _write function, only the reading
code.

This is what I found while testing. First, stock bzr.dev (--no-plugins):
.test_one_add_kernel_like_tree               OK  9081ms/17161ms
10825  10824  2116.8260  1016.3810
	elementtree.ElementTree:662(_write)
    1      0  3243.2560     0.0160
	+bzrlib.xml_serializer:42(write_inventory)

So that shows that the serializer is spending something like 3.2s just
writing out the inventory, and ElementTree._write is responsible for a
large portion of that.

So the first thing I wrote was a custom write function, which takes an
Element object and writes it to a file. I still used pack_inventory and
unpack_inventory; I just customized the actual XML serialization. That
gave me this performance:

.test_one_add_kernel_like_tree               OK  8277ms/16244ms
    1   0  791.1660  791.1660
	bzrlib.plugins.rio_inventory:28(_write_element)
   +1   0 1898.7990    0.0210
	+bzrlib.xml_serializer:42(write_inventory)

(Don't be fooled by rio_inventory, this is testing XML writing speed).
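
For readers who want to see the shape of such a writer, here is a
minimal sketch in Python. It is not the _write_element from the plugin;
the name write_element is hypothetical, it uses the standard library's
xml.etree.ElementTree rather than the standalone elementtree package,
and it ignores comments, processing instructions and namespaces.

```python
import io
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def write_element(out, elem):
    """Recursively write one Element and its children to a text stream.

    Hypothetical helper for illustration only; it handles plain tags,
    attributes, text and tail, and nothing else.
    """
    out.write("<" + elem.tag)
    # Sort attributes so the output is deterministic.
    for key, value in sorted(elem.attrib.items()):
        out.write(" %s=%s" % (key, quoteattr(value)))
    if elem.text or len(elem):
        out.write(">")
        if elem.text:
            out.write(escape(elem.text))
        for child in elem:
            write_element(out, child)
        out.write("</%s>" % elem.tag)
    else:
        # No text and no children: emit a self-closing tag.
        out.write(" />")
    if elem.tail:
        out.write(escape(elem.tail))


# Made-up inventory-like tree; the attribute names are placeholders.
root = ET.Element("inventory", {"format": "5"})
ET.SubElement(root, "file", {"file_id": "a-id", "name": "a & b"})
buf = io.StringIO()
write_element(buf, root)
xml_text = buf.getvalue()
```

The point of the design is that each element is written with a handful
of direct write() calls, with none of the general-purpose handling a
full serializer has to carry.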

Well, this looks like the element writing dropped from about 2.1s to
800ms, which is pretty good (write_inventory as a whole went from 3.2s
to 1.9s).
Then I realized we have a whole Python function which builds up a
custom object just to write it to a file. So I wrote my own C++
_write_inventory function, and now there is no intermediate step. With
that change, I get:

.test_one_add_kernel_like_tree               OK  7657ms/16251ms
    1   0 1111.6240  414.5960
	bzrlib.plugins.bzr_extensions:23(write_inventory)

Now, lsprof seems a little confused: it says that 400ms is spent inside
the function and 1.1s in total, but there is only one line, which calls
the C++ code.
Regardless, the total time spent in write_inventory is now down from
3.2s to 1.1s.
And the time for just 'bzr add'ing the kernel-like tree has dropped 2.5s.
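
The idea of skipping the intermediate Element tree can be sketched in
Python as well. This is only an illustration of the approach, not the
C++ code: write_inventory_direct, the (kind, file_id, name) tuples and
the attribute layout are all assumptions made up for the example.

```python
import io
from xml.sax.saxutils import quoteattr


def write_inventory_direct(entries, out):
    """Write inventory-like XML straight from tuples, with no Element tree.

    `entries` is a hypothetical iterable of (kind, file_id, name) tuples;
    the real inventory carries more fields than this.
    """
    out.write('<inventory format="5">\n')
    for kind, file_id, name in entries:
        # quoteattr escapes the value and adds the surrounding quotes.
        out.write("<%s file_id=%s name=%s />\n"
                  % (kind, quoteattr(file_id), quoteattr(name)))
    out.write("</inventory>\n")


buf = io.StringIO()
write_inventory_direct(
    [("directory", "src-id", "src"), ("file", "a-id", "a<b.c")], buf)
direct_text = buf.getvalue()
```

Each entry is formatted and written once, so no per-entry Element
object is ever allocated.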

At this point, the #1 most expensive thing is still the 'is_ignored'
call (clocking in at a total of 1.2s spent there).

These are the times without --lsprof-timed:
.test_one_add_kernel_like_tree               OK  4788ms/13088ms
vs
.test_one_add_kernel_like_tree               OK  3703ms/11701ms

There is a small problem that --lsprof-timed has a larger impact on
Python code than C++ code, so performance improvements have to be
verified without --lsprof. But it still looks like I was able to shave
off roughly 23% of the 'bzr add' time (4788ms down to 3703ms).
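
As a quick check of that saving, the arithmetic on the two unprofiled
runs above can be written out. The helper name relative_saving is
hypothetical; the numbers are the ones quoted in the benchmark lines.

```python
def relative_saving(before_ms, after_ms):
    """Fraction of the original time that was saved."""
    return (before_ms - after_ms) / before_ms


# First figure on each benchmark line (4788ms vs 3703ms).
add_saving = relative_saving(4788, 3703)
# Second figure on each benchmark line (13088ms vs 11701ms).
total_saving = relative_saving(13088, 11701)

print("first figure:  %.1f%% saved" % (100 * add_saving))
print("second figure: %.1f%% saved" % (100 * total_saving))
```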

It also should be very possible to write a custom Element writer, which
doesn't have quite as much overhead as ElementTree.write().

I'll look into that next.

John
=:->
