pack based repository design sketch and open questions
Robert Collins
robertc at robertcollins.net
Tue Jul 31 08:11:03 BST 2007
I thought I'd draft this mail as prototype docs for the pack repository
- questions are embedded in the text :)
Pack based repositories
=======================
Pack based repositories use a collection of bzrlib containers and
indices as their primary database store. They were designed to provide a
good tradeoff between complexity, data size, concurrency and data-writes
required during commit/push.
Outline
-------
Each repository has a collection of containers. Each container has a
number of associated indices. Currently the indices are core to the
repository, but future iterations should make the containers the sole
provider of authoritative data, with the indices generated on demand,
allowing for smaller network transfers.
Every commit/fetch operation will add a new container. The pack
operation will combine existing containers to reduce the number of
containers that need to be examined when reading data.
This makes commit write approximately 5 files on every operation,
allowing much less disk seeking than a file-per-user-file design, but
requiring regular packs to prevent performance from degrading.
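To make the tradeoff concrete, here is a toy in-memory sketch in Python. It is not bzrlib code: the class ToyPackRepository and its methods are hypothetical names invented for illustration, and a "container" is modelled as a plain dict. It only shows why lookups get slower as containers accumulate and why packing helps.

```python
class ToyPackRepository:
    """Toy model (hypothetical): each write adds a container; pack() merges them."""

    def __init__(self):
        # Each container is modelled as a dict mapping key -> bytes.
        self.containers = []

    def add_container(self, records):
        # A commit or fetch never rewrites old containers; it adds a new one.
        self.containers.append(dict(records))

    def get(self, key):
        # A read may have to examine every container, newest first.
        # Returns the value and how many containers were examined.
        for examined, container in enumerate(reversed(self.containers), start=1):
            if key in container:
                return container[key], examined
        raise KeyError(key)

    def pack(self):
        # Combine all existing containers into one to cut the lookup cost.
        combined = {}
        for container in self.containers:
            combined.update(container)
        self.containers = [combined]


repo = ToyPackRepository()
repo.add_container({"rev-1": b"first"})
repo.add_container({"rev-2": b"second"})
repo.add_container({"rev-3": b"third"})

_, examined_before = repo.get("rev-1")  # oldest data: all containers examined
repo.pack()
_, examined_after = repo.get("rev-1")   # after packing: a single container
```

In this sketch the cost of a read grows with the number of commits since the last pack, which is the degradation that regular packing is meant to prevent.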
Files
-----
PREFIX values are allocated when data is inserted. To allow access over
regular HTTP and other unlistable transports, there is a canonical list
of PREFIX values stored in .bzr/repository/pack-index.
Indices are located in .bzr/repository/indices. They take the form
PREFIX.SUFFIX. Valid SUFFIX values:
* .tix - text index
* .iix - inventory index
* .rix - revision index
* .six - signature index
Packs are located in .bzr/repository/packs. They are named
PREFIX.pack.
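As a rough illustration of the layout above, the following Python sketch maps a PREFIX to the file names that would belong to it. The function name pack_files and the dictionary keys are hypothetical; only the directory names and suffixes come from the text.

```python
import posixpath

# Directory and suffix names are taken from the layout described above.
REPO_DIR = ".bzr/repository"
INDEX_SUFFIXES = {
    "text": "tix",
    "inventory": "iix",
    "revision": "rix",
    "signature": "six",
}


def pack_files(prefix):
    """Return the relative paths of the pack and its indices for one PREFIX."""
    paths = {"pack": posixpath.join(REPO_DIR, "packs", prefix + ".pack")}
    for kind, suffix in INDEX_SUFFIXES.items():
        paths[kind] = posixpath.join(
            REPO_DIR, "indices", "%s.%s" % (prefix, suffix))
    return paths


files_for_first_pack = pack_files("0")
```

posixpath is used rather than os.path so the result is the same transport-style path on every platform.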
Prefix generation
-----------------
While we can go to GUID or SHA based naming, there does not seem to be a
need for this while we maintain an index, as safe index maintenance will
continue to require a write lock (though the length of the lock can be
reduced to that required to rename-into-place temporary files). OTOH
using a SHA, which can be determined once the pack temporary file is
written, may provide a useful cross check. For now, the simple allocation
scheme of '0', '1', '2' etc. is sufficient.
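A minimal Python sketch of the simple allocation scheme, plus the possible SHA cross check, might look like this. Both function names are hypothetical, the list of existing prefixes is assumed to have been read from the pack index under the write lock, and SHA-1 is only an assumed choice of hash.

```python
import hashlib


def next_prefix(existing_prefixes):
    """Allocate the next PREFIX: '0', '1', '2', ... (caller holds the write lock)."""
    used = [int(prefix) for prefix in existing_prefixes]
    if not used:
        return "0"
    # Use max + 1 rather than the count, so gaps never cause a reused name.
    return str(max(used) + 1)


def pack_checksum(pack_bytes):
    """Hypothetical cross check: a SHA-1 of the finished pack's content."""
    return hashlib.sha1(pack_bytes).hexdigest()


first = next_prefix([])
fourth = next_prefix(["0", "1", "2"])
after_gap = next_prefix(["0", "5"])
checksum = pack_checksum(b"example pack content")
```

Taking the maximum rather than the number of entries is a design choice of this sketch: it keeps names unique even if earlier packs have been combined and removed.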
Permissions
-----------
We have had problems with permissions preventing users sharing
repositories in the past. Generally this has occurred because we desired
to rewrite or append to files created by different users. With the pack
based repository the only file for which this applies is the pack index,
so for simplicity we could stop chmodding the other files (and my
prototype does this because it was the easiest way to glue things
together). That said, I don't chmod the pack index file yet, which is a
bug.
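One possible way to handle the pack index, sketched in Python under several assumptions: the function name write_pack_index is hypothetical, the on-disk format (one PREFIX per line) is invented for the example, and the group-writable mode 0o664 is just one plausible choice. It combines the rename-into-place idea with an explicit chmod; it is not what the prototype does.

```python
import os
import tempfile


def write_pack_index(path, prefixes, mode=0o664):
    """Replace the pack index via a temporary file that is renamed into place.

    Assumed format: one PREFIX per line. The caller is assumed to hold the
    repository write lock.
    """
    directory = os.path.dirname(path) or "."
    # Create the temporary file in the same directory so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp_file:
            tmp_file.write("\n".join(prefixes) + "\n")
        # mkstemp creates the file readable and writable by its owner only,
        # so set the shared mode explicitly before publishing it.
        os.chmod(tmp_path, mode)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Because the file is replaced rather than appended to, a second user needs write access to the containing directory rather than to the previous user's file; the chmod then keeps the new index readable and writable by the group.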
-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.