newformat format change

John A Meinel john at arbash-meinel.com
Wed Oct 5 16:44:08 BST 2005


John A Meinel wrote:
...

>
> Well, I played around with it, you can see my branch here:
> http://bzr.arbash-meinel.com/branches/bzr/sax/
>
> I only really modified the code which handles inventories, since the
> Revision XML is pretty straightforward.
>
> The specific changeset is also attached.
>
> In my testing, it didn't seem to change the processing speed by a whole
> lot. (The speed of bzr selftest was only about 1s faster).
>
> I'm still testing stuff like upgrade.

Well, because I can now control the XML output directly, I was able to
insert more newlines into the inventory and revision XML. That makes them
friendlier to weaves, as well as to editors (much shorter lines).
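
To illustrate why shorter lines help a line-based store, here is a rough
sketch. It uses difflib as a stand-in for the weave's line matching (it is
not the weave algorithm itself), and the element and attribute names are
made up for the example rather than taken from the real inventory format.
It counts how many bytes a line-based delta would have to store as new
text when a single entry changes.

```python
import difflib


def inserted_bytes(old_text, new_text):
    """Bytes of new text a line-based delta would need to record."""
    old = old_text.splitlines(True)
    new = new_text.splitlines(True)
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    total = 0
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "insert"):
            total += sum(len(line) for line in new[j1:j2])
    return total


def make_entries(revision_of_b):
    # Hypothetical entries; only b.txt's revision differs between versions.
    return [
        '<file file_id="a-1" name="a.txt" revision="r1"/>',
        '<file file_id="b-1" name="b.txt" revision="%s"/>' % revision_of_b,
        '<file file_id="c-1" name="c.txt" revision="r1"/>',
    ]


def serialize(entries, sep):
    # sep="" packs everything onto one line; sep="\n" gives one entry per line.
    return "<inventory>" + sep + sep.join(entries) + sep + "</inventory>\n"


old_entries, new_entries = make_entries("r1"), make_entries("r2")
packed = inserted_bytes(serialize(old_entries, ""), serialize(new_entries, ""))
per_line = inserted_bytes(serialize(old_entries, "\n"), serialize(new_entries, "\n"))
print("packed:", packed, "per-line:", per_line)
```

With everything on one line the whole document counts as changed; with one
entry per line only the modified entry does.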

This actually makes "bzr upgrade" take longer (I assume because it has
more lines to compare). However, the final inventory.weave file is much
smaller.

With bzr.newformat and 1854 revisions, inventory.weave is ~2.6M.
With my line-based version, inventory.weave is ~1.4M

So the line-based file is roughly 1/2 the size.

To me, that says it is probably worth it, because it means you only have
to download about 1/2 the inventory data from a remote branch.

But as near as I can tell, I didn't really improve the *speed* over the
original.

However, I'm just using cElementTree.iterparse to parse inventories,
which means it still builds an Element for each entry; I just flush out
the data before it collects everything.
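
A minimal sketch of that pattern, assuming the standard-library
xml.etree.ElementTree (which took over the role of cElementTree in modern
Python) and made-up tag and attribute names; this is not the code from my
branch. Each element is still fully built, but its data is copied out and
the element cleared as soon as its end tag is seen.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical inventory; the real format may use different names.
SAMPLE = b"""<inventory>
<directory file_id="src-1" name="src"/>
<file file_id="a-1" name="a.txt"/>
<file file_id="b-1" name="b.txt"/>
</inventory>
"""


def iter_entries(stream):
    """Yield each entry's attributes without keeping the whole tree."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag in ("file", "directory"):
            # Copy the attributes first: clear() empties elem.attrib too.
            yield dict(elem.attrib)
            elem.clear()


names = [entry["name"] for entry in iter_entries(io.BytesIO(SAMPLE))]
print(names)
```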

It might be even faster if we dropped down to xml.parsers.expat, which
is what ElementTree uses.
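
For comparison, a sketch of what going straight to expat could look like,
again with made-up tag names and with no claim about how much faster (if
at all) it would be. The start-element handler receives the attributes as
a plain dict, so no Element objects are created.

```python
import xml.parsers.expat


def parse_entries(data):
    """Collect entry attributes using expat callbacks only."""
    entries = []

    def start_element(name, attrs):
        # "file" and "directory" are hypothetical entry tags.
        if name in ("file", "directory"):
            entries.append(dict(attrs))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.Parse(data, True)  # True marks this as the final chunk
    return entries


sample = b'<inventory>\n<file file_id="a-1" name="a.txt"/>\n</inventory>\n'
parsed = parse_entries(sample)
print(parsed)
```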

John
=:->

