[MERGE][RFC] further add performance improvements

John A Meinel john at arbash-meinel.com
Wed May 24 02:57:00 BST 2006


John A Meinel wrote:
> Martin Pool wrote:
>> On 22 May 2006, Robert Collins <robertc at robertcollins.net> wrote:
>>
>>>> I would be curious to see if switching to RIO would gain us more of a
>>>> speed improvement than using XML. Though I'm not confident, since we are
>>>> using a compiled XML parser, versus a potential python parser (for RIO).
>>> Martin indicated to me that RIO was up to twice as fast as cElementTree.
>> That was the case when I benchmarked writing inventories a while ago.
>> I had wondered if building the tree and then writing it through
>> cElementTree was slowing things down and it's interesting to hear that
>> it's not.
>>
>> I might update that and put it into subclasses of the new benchmark
>> suite.
>>
> 
> Well, I don't know what rio format you were choosing to use, but in my
> test, I found that rio is slower than cElementTree at reading, but can
> be faster at writing.
> 
> I have a plugin available from here:
> http://bzr.arbash-meinel.com/plugins/rio_inventory

...

> I'm also going to be looking into the effect on Knits of the new format.
> 

I tested creating a Knit with my rio inventory format. With the new
format, these are the basic results (this involves re-adding all of my
inventories in my bzr repository):

processed 5619 inventories

extracted in 26.71s (4.8ms avg)
Times   read (s)   read avg (ms)   write (s)   write avg (ms)   read+write
xml:       71.82        12.8          270.07        48.1
rio:      252.53        44.9          117.99        21.0
ratio:    3.5160                      0.4369                      1.0837
Times   add (s) extract(s)
xml:      98.38    18.55
rio:     272.68    33.37
ratio:     2.77     1.80

sizes:  xml.knit  rio.knit  ratio
         6880558   5693021  0.827
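
For anyone wanting to check the ratios, here is a small sketch that recomputes them from the rounded figures above. The dictionary layout and names are mine, not part of the benchmark plugin, and because the inputs are already rounded the last digit can differ slightly from the reported ratios. Every ratio is rio divided by xml, so anything below 1.0 favours rio.

```python
# Figures copied from the benchmark output above (seconds and bytes).
xml = {"read": 71.82, "write": 270.07, "add": 98.38, "extract": 18.55,
       "size": 6880558}
rio = {"read": 252.53, "write": 117.99, "add": 272.68, "extract": 33.37,
       "size": 5693021}


def ratio(key):
    """Return rio / xml for one measurement; below 1.0 means rio wins."""
    return rio[key] / xml[key]


ratios = {key: ratio(key) for key in xml}

# Combined read + write cost; this appears to be what the 1.0837 figure is.
total_ratio = (rio["read"] + rio["write"]) / (xml["read"] + xml["write"])

# Fraction of the xml.knit size that the rio.knit saves.
size_saving = 1.0 - ratios["size"]

for key, value in ratios.items():
    print(f"{key:8s} rio/xml = {value:.4f}")
print(f"read+write rio/xml = {total_ratio:.4f}")
print(f"rio.knit is {size_saving:.1%} smaller than xml.knit")
```

Read this way, rio costs about 8% more than xml for a full read plus write cycle, while the knit ends up roughly 17% smaller.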

The add and extract times are so much longer because knit (and weave)
are line oriented, and now every attribute is another line, rather than
being more information on the same line.
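
To illustrate the trade-off, here is a toy sketch. It does not use the real bzr serializers or the Knit code: the two layouts below are simplified stand-ins for the xml and rio inventory formats, the attribute names are made up for the example, and difflib.SequenceMatcher stands in for whatever line matcher the knit uses. It changes one attribute of one entry and compares how many lines have to be handled against how many bytes a line-based delta would have to store.

```python
import difflib

# Hypothetical attribute set, chosen only for this illustration.
ATTRS = ("file_id", "name", "parent_id", "revision", "text_sha1", "text_size")


def make_entries(count, changed=None):
    """Build toy inventory entries; `changed` marks one entry as modified."""
    entries = []
    for i in range(count):
        entries.append({
            "file_id": f"file-{i}",
            "name": f"name{i}.py",
            "parent_id": "root",
            "revision": "rev-2" if i == changed else "rev-1",
            "text_sha1": f"{i:040x}",
            "text_size": str(100 + i),
        })
    return entries


def one_line_per_entry(entries):
    """XML-like layout: all attributes of an entry share one line."""
    return ["<file " + " ".join(f'{k}="{e[k]}"' for k in ATTRS) + " />\n"
            for e in entries]


def one_line_per_attribute(entries):
    """RIO-like layout: each attribute gets its own line."""
    return [f"{k}: {e[k]}\n" for e in entries for k in ATTRS]


def delta_cost(old, new):
    """Return (lines in new text, bytes of new lines the delta must store)."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    stored = 0
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            stored += sum(len(line) for line in new[j1:j2])
    return len(new), stored


old, new = make_entries(50), make_entries(50, changed=25)
xml_lines, xml_stored = delta_cost(one_line_per_entry(old),
                                   one_line_per_entry(new))
rio_lines, rio_stored = delta_cost(one_line_per_attribute(old),
                                   one_line_per_attribute(new))
print(f"xml-like: {xml_lines} lines, {xml_stored} delta bytes")
print(f"rio-like: {rio_lines} lines, {rio_stored} delta bytes")
```

In this sketch the rio-like text has six times as many lines to match, which is in line with the slower add and extract, but a single changed attribute only forces one short line into the delta instead of the whole entry.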

But one very nice thing is the extra compression: the rio knit is 82.7% of
the size of the xml knit, so about 17% smaller.

I'm going to look into a format like my sax work, to see about a 2-line
inventory style.

John
=:->

