[MERGE][RFC] further add performance improvements

John A Meinel john at arbash-meinel.com
Sat May 20 14:33:50 BST 2006


Robert Collins wrote:
> On Fri, 2006-05-19 at 09:22 -0500, John Arbash Meinel wrote:

...

>>>
>> I think we could have 'add' always add the files to the inventory. We
>> just don't have to write the inventory to the disk when we are done.
> 
> That was my example above - not calling write_inventory. Sounds like you
> agreed.
> 
> Do I remember you doing some profiling on inventory writing performance
> at some point?
> 
> Rob
> 

I profiled cElementTree against a manual to_xml converter, and found that
neither was a whole lot better or worse than the other (I believe
cElementTree uses ElementTree's Python implementation for serializing to
a string).
The biggest thing the manual converter let us do was customize how we
wrote the XML, so that it would play much nicer with Weaves. I could cut
the inventory.weave size drastically, to half or less.
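
As a rough illustration of why the XML layout matters here, the sketch
below assumes that the storage keeps line-based differences between
versions, so the fewer lines that change, the less gets stored. The
to_xml_lines helper and the (file_id, name, kind) tuples are hypothetical
and are not the actual bzrlib serializer. With one entry per line, adding
a file changes exactly one line:

```python
import difflib
from xml.sax.saxutils import quoteattr


def to_xml_lines(entries):
    """Serialize (file_id, name, kind) tuples, one XML element per line.

    Hypothetical helper for illustration only.
    """
    lines = ["<inventory>\n"]
    for file_id, name, kind in sorted(entries):
        lines.append("<%s file_id=%s name=%s />\n"
                     % (kind, quoteattr(file_id), quoteattr(name)))
    lines.append("</inventory>\n")
    return lines


old = to_xml_lines([("a-id", "a.txt", "file"),
                    ("c-id", "c.txt", "file")])
new = to_xml_lines([("a-id", "a.txt", "file"),
                    ("b-id", "b.txt", "file"),
                    ("c-id", "c.txt", "file")])

# Collect the lines of the new version that a line-based delta would
# have to store, i.e. everything not matched against the old version.
matcher = difflib.SequenceMatcher(None, old, new)
changed = [new[j]
           for tag, i1, i2, j1, j2 in matcher.get_opcodes()
           if tag != "equal"
           for j in range(j1, j2)]
print(changed)
```

If the whole inventory were written as one long line, the same change
would replace the entire line instead.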

We could probably get a similar improvement in knit sizes, though I
would rather see us break up the inventory into per-directory pieces first.
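
A minimal sketch of the per-directory idea, assuming each directory's
entries would be serialized as a separate unit. The split_by_directory
helper and the sample paths are hypothetical. It shows that adding one
file only dirties the unit for its parent directory:

```python
import posixpath
from collections import defaultdict


def split_by_directory(paths):
    """Group file paths by parent directory ('' is the tree root)."""
    per_dir = defaultdict(list)
    for path in paths:
        parent, name = posixpath.split(path)
        per_dir[parent].append(name)
    return {d: sorted(names) for d, names in per_dir.items()}


tree = ["README", "bzrlib/add.py", "bzrlib/inventory.py", "doc/index.txt"]
before = split_by_directory(tree)
after = split_by_directory(tree + ["bzrlib/newfile.py"])

# Only directories whose listing changed would need to be rewritten.
dirty = sorted(d for d in after if after[d] != before.get(d))
print(dirty)
```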

I would be curious to see whether switching to RIO would gain us more of
a speed improvement than staying with XML, though I'm not confident it
would, since we are using a compiled XML parser and a RIO parser would
potentially be pure Python.
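
One way to check this would be a small timing harness like the sketch
below. It rests on several assumptions: parse_stanzas is a simplified
"name: value" stanza parser standing in for RIO (it is not bzrlib's rio
module and ignores continuation lines), the sample data is synthetic, and
xml.etree.ElementTree is used in place of cElementTree (on current Python
versions it picks up the C accelerator when one is available). No timings
are claimed here; the numbers depend on the machine and the data.

```python
import timeit
import xml.etree.ElementTree as ET


def parse_stanzas(text):
    """Parse blank-line-separated 'name: value' stanzas into dicts."""
    stanzas = []
    for block in text.strip().split("\n\n"):
        stanza = {}
        for line in block.splitlines():
            name, value = line.split(": ", 1)
            stanza[name] = value
        stanzas.append(stanza)
    return stanzas


def parse_xml(text):
    """Parse one-element-per-entry XML into a list of attribute dicts."""
    return [dict(el.attrib) for el in ET.fromstring(text)]


N = 2000  # number of synthetic inventory entries
xml_text = ("<inventory>\n"
            + "".join('<file file_id="id-%d" name="f%d.txt" />\n' % (i, i)
                      for i in range(N))
            + "</inventory>\n")
rio_text = "\n".join("file_id: id-%d\nname: f%d.txt\n" % (i, i)
                     for i in range(N))

xml_secs = timeit.timeit(lambda: parse_xml(xml_text), number=20)
rio_secs = timeit.timeit(lambda: parse_stanzas(rio_text), number=20)
print("xml: %.4fs  stanzas: %.4fs" % (xml_secs, rio_secs))
```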

We may want a way to write a delta-inventory, since with a large tree
you have to read the whole inventory, add a single line, and write out
the whole thing again. If we broke it up per-directory, that could
probably be improved a lot.
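
A delta-inventory could be sketched as an append-only log of changes that
is replayed on top of the last full inventory. Everything below is a
hypothetical illustration: the tab-separated record format, the
append_delta and replay helpers, and the dict-based inventory are
assumptions, and the format would break on paths containing tabs.

```python
import io


def append_delta(log, op, file_id, path=""):
    """Append one change record instead of rewriting the full inventory."""
    log.write("%s\t%s\t%s\n" % (op, file_id, path))


def replay(base, log_text):
    """Apply 'add'/'remove' records, in order, to a copy of the base."""
    inventory = dict(base)
    for line in log_text.splitlines():
        op, file_id, path = line.split("\t")
        if op == "add":
            inventory[file_id] = path
        elif op == "remove":
            inventory.pop(file_id, None)
    return inventory


base = {"a-id": "a.txt"}   # last fully written inventory
log = io.StringIO()        # stands in for an append-only file on disk
append_delta(log, "add", "b-id", "b.txt")
append_delta(log, "remove", "a-id")

current = replay(base, log.getvalue())
print(current)
```

The cost of an 'add' then scales with the size of the change rather than
the size of the tree, at the price of a replay step on read.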

John
=:->
