newformat format change

Gustavo Niemeyer gustavo at niemeyer.net
Fri Sep 30 21:15:13 BST 2005


> RFC822 formatting doesn't group data that way.  Or rather, the
> .rsyncexclude data would be the message body.

Sorry. I should have said rfc822-styled.

> That looks more like ini format:
> [bzrignore-20050311232317-81f7b71efa2db11a]
> Kind="file"
> Name=".bzrignore"
[...]

Yes, that's another way to do it.
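
For illustration only, here is a minimal sketch of reading a record in that
ini-like style with Python's standard configparser module. The section name
and the two keys come from the quoted example; the helper name parse_entries
is hypothetical, and this is not necessarily how the attached
textserializer.py works.

```python
import configparser

# Only the lines quoted above; the elided remainder is left out.
SAMPLE = '''\
[bzrignore-20050311232317-81f7b71efa2db11a]
Kind="file"
Name=".bzrignore"
'''

def parse_entries(text):
    """Hypothetical helper: map each section (file id) to its fields."""
    parser = configparser.ConfigParser(interpolation=None)
    # Keep the "Kind"/"Name" capitalisation instead of lowercasing keys.
    parser.optionxform = str
    parser.read_string(text)
    return {
        file_id: {key: value.strip('"')  # drop the surrounding quotes
                  for key, value in parser[file_id].items()}
        for file_id in parser.sections()
    }

entries = parse_entries(SAMPLE)
for file_id, fields in entries.items():
    print(file_id, fields)
```

The section header doubles as the file id, so each entry stays
self-contained and easy to diff line by line.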

> > Does that look complex?
> 
> No, but you've omitted a lot of inventory data.  Most crucially, you
> haven't specified how parent directories are indicated, or how to
> specify additional metadata.

To make the discussion more concrete I've actually implemented
the textual serializer. To enable it just append the line

  from bzrlib.textserializer import serializer as serializer_v5

at the end of xml5.py, and drop the attached textserializer.py
in bzrlib.

WARNING: That's for testing purposes, and may destroy any real
         repository. :-)

Please bear in mind that this is not a merge suggestion or
anything. It's just a proof of concept, so that we can discuss
the issue with more information and avoid saying just
"I believe". :-)

One of the interesting points about the results is that the
*uncompressed* size of the new textual revision storage is
smaller than the current *compressed* size:

XML revision storage of bzr, compressed:

  % ls *.gz | wc -l
  1807
  % du -shc --apparent *.gz | tail -1
  655K    total

Textual revision storage of bzr, uncompressed:

  % ls * | wc -l
  1807
  % du -shc --apparent * | tail -1
  549K    total

I've also converted the inventory weave, again with positive results:

  % ls -l old-inventory.weave new-inventory.weave
  -rw-r--r--  1 niemeyer niemeyer 2400236 2005-09-30 16:46 new-inventory.weave
  -rw-r--r--  1 niemeyer niemeyer 5614525 2005-09-30 16:28 old-inventory.weave

And the same two files, compressed:

  % ls -l old-inventory.weave.gz new-inventory.weave.gz
  -rw-r--r--  1 niemeyer niemeyer  324199 2005-09-30 16:46 new-inventory.weave.gz
  -rw-r--r--  1 niemeyer niemeyer 1089354 2005-09-30 16:28 old-inventory.weave.gz
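
As a rough check of those figures, the short sketch below computes the
new/old size ratios from the byte counts in the two listings. The variable
names are only illustrative; the numbers are copied from the ls output.

```python
# Byte counts taken from the ls -l listings: (new, old).
weave_sizes = {
    "uncompressed": (2400236, 5614525),
    "gzip": (324199, 1089354),
}

# Ratio of the new textual weave to the old XML weave, per storage form.
ratios = {name: new / old for name, (new, old) in weave_sizes.items()}

for name, ratio in ratios.items():
    print(f"{name}: new weave is {ratio:.1%} of the old size")
```

That puts the new weave at roughly 43% of the old size uncompressed and
roughly 30% once both are gzipped.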

> Properties look like an obvious example to me.  You might also want to
> check out http://bazaar.canonical.com/InventoryFormatV2 for an example
> of nested inventories.

I'll have a look. Anyway, I don't think it'd be hard to come up
with a similar solution for properties.

> Of course, nothing's *necessary*.  We could write an alternative format.
>  XML was picked mainly because of the broad tool support, and because it
> solved issues like nesting and encoding.

Perfectly understood. I don't think it's *wrong* either. I'd just like
to discuss alternatives, to be sure that this is really the way to go.

-- 
Gustavo Niemeyer
http://niemeyer.net
-------------- next part --------------
A non-text attachment was scrubbed...
Name: textserializer.py
Type: text/x-python
Size: 6135 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20050930/bb298688/attachment.py 

