Semi-stable disk format for brisbane-core?

John Arbash Meinel john at arbash-meinel.com
Mon Mar 16 00:51:50 GMT 2009


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ian Clatworthy wrote:
> John/Robert,
> 
> Thanks for all the great work done last week in the sprint.
> As always, it was great to catch up with and learn so much
> from you guys.
> 
> Can you please outline what decisions we need to still make
> before we can make one of the brisbane-core disk formats
> stable-ish? I'd like to kick off some *large* conversions
> soon (e.g. OOo) and I'd prefer to start that after you think
> we're approaching something semi-stable.
> 
> Given the testing done and progress made last week, I'm
> strongly leaning towards chk255-big without labels (for
> chk inventories at least). With Aaron's fix to make
> tree-references something that must be explicitly asked for,
> I'm guessing we want the subtree serialiser used instead of
> the rich-root one as well?
> 
> Anything else?
> 
> Ian C.
> 

I have at least one pending change to the group serialization: I want
to record the uncompressed size so that it is easy to know up-front (I
already have a patch for this). I can certainly write it so that we can
read either format (right now 'bzr pack' can convert between
with/without labels, and between lzma and zlib, so I could certainly
allow one more variant...)

I agree that we want to do subtree+rich-root as soon as Aaron's changes
get merged back into brisbane-core.

As for layouts... I'm still thinking of trying a 16+255-way layout
(maybe big page, maybe not...) or some similar variant.

I'm almost done with the byte-stream fetch (I worked on it on the
airplane), which is part of what prompted the desire to add some more
numbers to the serialized form. (When fetching, you can tell how much of
the stream you will be sending, without decompressing, by checking the
byte ranges, etc.)
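
To illustrate why having the sizes up-front helps, here is a minimal
sketch. The header layout, the magic string and the function names are
all hypothetical, made up for this example; this is not the actual
brisbane-core group format. It only shows that a reader can learn both
lengths from the header without touching the compressed payload.

```python
import struct
import zlib

# Hypothetical header: 4-byte magic, then the compressed and uncompressed
# lengths as big-endian 32-bit integers. NOT the real brisbane-core layout.
_HEADER = struct.Struct(">4sII")
_MAGIC = b"gcz\n"


def write_group(content):
    """Serialize one group: header with both sizes, then zlib payload."""
    compressed = zlib.compress(content)
    return _HEADER.pack(_MAGIC, len(compressed), len(content)) + compressed


def peek_sizes(group_bytes):
    """Return (compressed_len, uncompressed_len) without decompressing."""
    magic, z_len, raw_len = _HEADER.unpack_from(group_bytes, 0)
    if magic != _MAGIC:
        raise ValueError("not a group block")
    return z_len, raw_len


def read_group(group_bytes):
    """Decompress the payload and check it against the recorded size."""
    z_len, raw_len = peek_sizes(group_bytes)
    start = _HEADER.size
    content = zlib.decompress(group_bytes[start:start + z_len])
    if len(content) != raw_len:
        raise ValueError("uncompressed size mismatch")
    return content
```

A fetch could call something like peek_sizes() on each group to total up
what it is about to send, and only call read_group() when it actually
needs the content.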

Another possible change to the byte-stream would be to remove the "size
of source" field at the start of each delta. It isn't huge, probably
something like 3 bytes per delta record, but those bytes wouldn't
likely compress well (3 bytes out of 40 is 7.5%, which is becoming a
non-trivial amount).
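
For a rough feel of where a figure like 3 bytes comes from, here is a
small sketch. It assumes the source size is stored as a base-128
variable-length integer (7 payload bits per byte, high bit meaning "more
bytes follow"), which is a common choice in delta formats; the function
name is made up for the example.

```python
def encode_base128(value):
    """Encode a non-negative int, 7 bits per byte, low-order group first.

    The high bit of each byte is set when more bytes follow.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)  # final byte
            return bytes(out)


# Any source between 16 KiB and just under 2 MiB needs 3 bytes.
source_size = 20000
header_len = len(encode_base128(source_size))
overhead = header_len / 40  # fraction of a 40-byte delta record
```

Under that assumption, any source size from 16384 up to 2097151 bytes
takes 3 bytes to encode, so for a 40-byte delta record the field alone
is 7.5% of the record.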

I can't think of anything else I want to change about the bytes on disk.

As mentioned, we could make the formats compatible if we need to. At a
minimum, converting from one CHK format to another should be a lot
faster than converting from an XML format, though perhaps converting
from a fast-import stream would be just as good.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkm9oqUACgkQJdeBCYSNAANoFwCfTPhCbNRLrjayFtp1qJPY+LIx
JeQAnjPuTvcr1SHU876MSbMWsUk41u5V
=SoQx
-----END PGP SIGNATURE-----


