[PACKS] Format change to pack-names has been made

Robert Collins robertc at robertcollins.net
Tue Aug 28 04:15:58 BST 2007


The change I previously announced has been made; the pack-names index now
contains the length of each index file. I'm recapping the change below,
and I've got updated dogfooding instructions to work around the current
data issue in bzr.dev.

This change is pretty minimal: the pack-names index changes from having
no content in the node values to holding the length of each index,
written as ASCII decimal digits and space separated.

E.g.
1997a0a93f6b497b1d039a4d2a5cfc8b^@^@^@
will become
1997a0a93f6b497b1d039a4d2a5cfc8b^@^@^@196 200 290 131

This is easy enough to do by hand - ls -l the indices subdirectory and
put the sizes into the pack-names index using vim or some other editor
not confused by \x00 bytes. The order is the sizes of
the .rix, .iix, .tix, .six files. If you do a 'bzr pack' before pulling
the format change revision, you will only have 2 packs to adjust the
details for, which will make it easier for you.
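
As a rough sketch of the by-hand step above (this is not code from bzr
itself): the Python below computes the value string for one pack by
statting its four index files in the order given. The function names,
and the assumption that the files are named <pack-name>.rix and so on
inside the indices directory, are mine and purely illustrative.

```python
import os

# Order stated in the email: .rix, .iix, .tix, .six
INDEX_SUFFIXES = (".rix", ".iix", ".tix", ".six")


def index_sizes_value(indices_dir, pack_name):
    """Return the space-separated ASCII sizes of one pack's index files.

    Assumes (hypothetically) that each index lives at
    indices_dir/<pack_name><suffix>.
    """
    sizes = []
    for suffix in INDEX_SUFFIXES:
        path = os.path.join(indices_dir, pack_name + suffix)
        sizes.append(str(os.path.getsize(path)))
    return " ".join(sizes)


def parse_index_sizes(value):
    """Split a node value back into a suffix -> size mapping."""
    parts = value.split(" ")
    if len(parts) != len(INDEX_SUFFIXES):
        raise ValueError(
            "expected %d sizes, got %r" % (len(INDEX_SUFFIXES), value))
    return dict(zip(INDEX_SUFFIXES, (int(p) for p in parts)))
```

This only produces the string to paste in after the ^@^@^@ of the
matching pack-names entry; it does not rewrite the pack-names index
itself, so the edit is still done by hand.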

These sizes will be used to seed bisection and reduce the amount of data
read over the network for push and pull. (Though the .rix will still be
fully read until further changes are done to bzr's push/pull logic).

Pack repositories
-----------------

Pack repositories offer some significant benefits over knit repositories
for dumb protocol access. The key benefit is the ability to cap the
total latency vs knits where each separate file causes additional round
trips. We also reduce the total VFS operations required - we no longer
need to append to files, and that makes both SFTP and FTP support
easier.

Status
------

Pack repositories are an experimental format aimed at 0.92. They
currently work fully and are undergoing performance tuning - part of
which is in the repository format, and part of which is the use of the
repository from the rest of bzr - working through the size(history)
operations we do and removing them.

Martin is working on the inventory layer, with the goal of integrating a
more scalable inventory into the pack format. The two things in
combination - the performance tuning, and the plans to change
representation, mean that the disk format for packs is subject to change
- all such changes will be announced here with instructions for
migrating test repositories.

Dogfooding
----------

In order to find rough edges, I'm using the pack format full time myself
at the moment. There is a copy in knit format at 
http://people.ubuntu.com/~robertc/pack-repository.knits. This is kept
reasonably up to date, but the very latest code is only available via
the pack-based format branch at 
http://people.ubuntu.com/~robertc/baz2.0/repository. I update the knits
copy when format-related changes occur, but not for e.g. performance
tweaks.

When dogfooding, be aware that folk who are not also dogfooding will not
be able to read pack based repositories. You may need to keep some
branches in knit format to interoperate. (I have kept my integration
branch like that :)).

To dogfood:
 - pull a copy of the pack supporting branch:
   'bzr branch http://people.ubuntu.com/~robertc/pack-repository.knits packs.knits'
 - Then you can create a branch or repository in experimental format:
   '$ packs.knits/bzr init --experimental my-pack-branch'
   or
   '$ packs.knits/bzr init-repo --experimental my-pack-repo'
 - Now just pull in content from a knits branch and bzr will convert the
   data on the fly:
   - for a branch: 'cd my-pack-branch && ../packs.knits/bzr pull URL'
   - for a repo: 'packs.knits/bzr branch URL my-pack-repo/PATH'

NOTE: The bzr.dev repository has a knit delta/index mismatch with the
inventory content pointer that causes pull to not grab enough data; this
is currently being addressed in bzr.dev. So don't test packs on bzr
itself unless you seed your copy by wgetting the contents of
http://people.ubuntu.com/~robertc/baz2.0/.bzr/repository

I have put the packs.knits directory on my path, so that I use that bzr
all the time, but this isn't a requirement.

I will announce all changes to the format with a [PACKS] email to the
list, so you can take whatever action is needed to upgrade. In general
this will be straightforward, as with the change described in this
email. The most dramatic changes may require an export of the repository
content to a knit repository, then creation of a new repository - but
full instructions will be included if/when that becomes necessary.

Known issues
------------

At the moment remote access will read the entire text index rather than
just the needed contents; this can now be fixed by using the additional
data made available by the format change described at the start of this
email.


-Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.