[RFC] store inventory in tab-separated file

Alexander Belchenko bialix at ukr.net
Mon Jan 29 03:53:54 GMT 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I wrote a draft implementation of a new serializer format that uses
tab-separated text instead of XML. John Meinel often says that the inventory
is our weak point, so I experimented with rewriting our serializer.

After converting the current bzr.dev inventory to the new format, its size
dropped from 73619 to 56476 bytes, i.e. by 23%. I expect the effect to be
much bigger on the kernel tree. I also expect that inventory.knit in the
repository will shrink as well.
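
To illustrate where the saving comes from, here is a minimal sketch. It is
not the attached tabseparated.py: the column order, the sample entries and
the function names are assumptions made for illustration, and it is written
for present-day Python. It also ignores escaping, so names containing tabs
or newlines would need extra handling.

```python
from xml.sax.saxutils import quoteattr

# Hypothetical entries: (kind, file_id, parent_id, name, revision).
ENTRIES = [
    ("directory", "bzrlib-1", "TREE_ROOT", "bzrlib", "rev-1"),
    ("file", "inventory.py-1", "bzrlib-1", "inventory.py", "rev-2"),
]


def to_xml(entries):
    """Serialize entries as one XML element per entry (XML-like baseline)."""
    lines = ["<inventory>"]
    for kind, file_id, parent_id, name, revision in entries:
        lines.append(
            "<%s file_id=%s name=%s parent_id=%s revision=%s />"
            % (kind, quoteattr(file_id), quoteattr(name),
               quoteattr(parent_id), quoteattr(revision))
        )
    lines.append("</inventory>")
    return "\n".join(lines) + "\n"


def to_tabs(entries):
    """Serialize entries as one tab-separated line per entry."""
    return "".join("\t".join(entry) + "\n" for entry in entries)


def from_tabs(text):
    """Parse the tab-separated text back into a list of tuples."""
    return [tuple(line.split("\t")) for line in text.split("\n") if line]


# Relative size reduction for the figures quoted above (about 0.23).
reduction = (73619 - 56476) / 73619
```

The tab-separated form drops the repeated attribute names and quoting, which
is where most of the bytes go in the XML form.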

I have also taken one step towards implementing versioned properties
(http://bazaar-vcs.org/VersionedProperties).

Attached you can find my module tabseparated.py with the new serializer,
the test_tabs.py script for manual testing, and inventory.xls -- the bzr.dev
inventory converted to the new format. This file opens easily in OpenOffice.

I have some questions and need some guidance on the next steps.

0) Does it make sense to continue this work?
1) How do I benchmark speed with the new inventory format? I expect it should
be faster, but I can't predict by how much.
2) Why are we using two similar formats, v5 and v6? Why is v5 used for the
working inventory -- for speed? Do I need to implement v7 and v8 formats,
or just one rich format a la v6?
3) I don't understand how versioned properties should be extended. Should I
simply throw away unrecognized properties? Can I add support for
packing/unpacking versioned properties to the specific InventoryEntry
classes (InventoryDirectory, InventoryFile, etc.)? The specification at
http://bazaar-vcs.org/VersionedProperties says that "Inventory and
InventoryEntry will get proplist attributes, that will hold the
properties". Does that mean we need a shiny new inventory2.py file with a
new implementation?
4) What tests do I need to write for the new serializer?
5) How do I write a converter for the upgrade?
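
For question 1, one possible starting point is a micro-benchmark with the
standard timeit module. This is only a sketch under stated assumptions: the
synthetic entries, the column layout and the names parse_tabs, parse_xml and
best_time are hypothetical, and neither parser is the real bzr serializer.
It times parsing only, on generated data, so real numbers would have to come
from actual inventories.

```python
import timeit
import xml.etree.ElementTree as ET

N = 2000  # number of synthetic inventory entries (arbitrary)

# Hypothetical rows: (kind, file_id, parent_id, name, revision).
rows = [("file", "id-%d" % i, "TREE_ROOT", "name-%d" % i, "rev-1")
        for i in range(N)]

# The same data in a tab-separated form and in an XML-like form.
tab_text = "".join("\t".join(row) + "\n" for row in rows)
xml_text = "<inventory>%s</inventory>" % "".join(
    '<file file_id="%s" parent_id="%s" name="%s" revision="%s" />' % row[1:]
    for row in rows
)


def parse_tabs():
    """Split the tab-separated text into one list of fields per entry."""
    return [line.split("\t") for line in tab_text.split("\n") if line]


def parse_xml():
    """Parse the XML text and pull the same fields out of each element."""
    return [
        (e.tag, e.get("file_id"), e.get("parent_id"),
         e.get("name"), e.get("revision"))
        for e in ET.fromstring(xml_text)
    ]


def best_time(func, repeat=3, number=5):
    """Return the best observed time in seconds for one call of func."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


print("tabs: %.6f s per parse" % best_time(parse_tabs))
print("xml:  %.6f s per parse" % best_time(parse_xml))
```

Taking the minimum over several repeats reduces noise from other processes;
the same harness could time serialization by swapping in write functions.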

- --
Alexander
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFvW/SzYr338mxwCURAgGkAJ4ppL9qATbx9HufJfJReI0XBk76EwCfUv8d
QU/pxGATk0E0JVTv+96zKvM=
=ktkc
-----END PGP SIGNATURE-----
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test_tabs.py
Url: https://lists.ubuntu.com/archives/bazaar/attachments/20070129/441422ee/attachment-0003.diff 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: inventory.xls
Url: https://lists.ubuntu.com/archives/bazaar/attachments/20070129/441422ee/attachment-0004.diff 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: tabseparated.py
Url: https://lists.ubuntu.com/archives/bazaar/attachments/20070129/441422ee/attachment-0005.diff 

