Container repository data ordering

Robert Collins robertc at robertcollins.net
Thu Jul 19 07:32:33 BST 2007


On Wed, 2007-07-18 at 10:06 -0400, Aaron Bentley wrote:
> 
> The new container-based repositories will group data according to sets
> of revisions.  Each container will provide the data for a set.
> 
> Since repositories are read more often than they are written, it makes
> sense to optimize for read.  Ideally, we would be able to extract all
> the data we need in a single pass with no seeking.

This is a good goal to aim for. We should find some way of measuring how
close we actually come to it.
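
As a rough illustration (purely hypothetical; the class and names are
made up, not existing bzrlib API), we could instrument the container
read path with a wrapper that counts seeks and bytes read, then run a
tree-wide operation against it:

    class SeekCountingFile(object):
        """Wrap a file object and count seeks and bytes read."""

        def __init__(self, wrapped):
            self._wrapped = wrapped
            self.seek_count = 0
            self.bytes_read = 0

        def seek(self, offset, whence=0):
            self.seek_count += 1
            return self._wrapped.seek(offset, whence)

        def read(self, size=-1):
            data = self._wrapped.read(size)
            self.bytes_read += len(data)
            return data

A perfect single-pass read would show a seek_count of 0 or 1 and
bytes_read close to the amount of data actually needed.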

> There are four main datatypes in a repository:
> - file versions
> - inventory data
> - revision data
> - signatures
> 
> For reconstruction, inventory data must come before file texts, because
> we cannot know which file versions we need until we've examined the
> inventories.
> 
> Some commands such as "log" and "missing" can use revision data without
> reference to any other data.  By contrast, almost every use case for
> inventory data includes using file versions.
> 
> Signatures can only be verified once the inventory and revision have
> been read, so they shouldn't come before the inventory or revision data.
> 
> Therefore, I propose this ordering:
> 1. revision data (to be nice to log)
> 2. inventory data
> 3. signatures
> 4. file versions
> 
> This sequence should work well for tree-wide operations like "merge",
> "revert", "status", "diff" and "checkout".  It won't provide as much
> locality-of-reference as the current format does for single-file
> commands like "cat" and "annotate".  I think that's inevitable.

Hmmm, I think it should have no worse locality of reference, and
possibly better than the current formats, once you consider disk block
allocation.

> File versions have an additional ordering problem:
> What order should building happen in?
> 
> The current knit format orders data by file, and orders versions from
> oldest to newest.

Modulo ghosts :). Filled ghosts can come in any order.

>   Deltas are intended to be applied in forward order.

> What are our requirements?
> 1. Building recent versions should be fast
> 2. We should not hold more than one fulltext in memory at a time

This is for tree-building? For merge I presume we want to get 2-3
fulltexts out at once?

I think requirement 2 above could be explained a bit more; it isn't
obvious to me why holding more than one fulltext in memory at a time is
a problem.

> The desire for fast recent versions motivates the suggestion that
> patching should be done backwards: to generate an old version, you get
> the fulltext of a new version and apply patches to it to turn it into
> the old one.
> 
> However, the desire to avoid holding more than one fulltext in memory
> means that once we have read a fulltext, we cannot have another fulltext
> until all the relevant patches have been read.  So if the fulltext comes
> first, this suggests that the relevant patches must come immediately
> afterward.

Agreed.
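
To make the single-fulltext constraint concrete, here is a minimal
sketch (the record stream shape and the apply_delta helper are
assumptions, not real bzrlib code) of consuming a fulltext followed by
the backward deltas based on it, holding only one text at a time:

    def iter_versions_backward(records, apply_delta):
        # 'records' yields (revision_id, kind, payload) in stream
        # order: one fulltext, then the deltas based on it.
        # apply_delta(text, delta) is assumed to rebuild the next
        # older text from the current one.
        text = None
        for revision_id, kind, payload in records:
            if kind == 'fulltext':
                text = payload
            else:
                # The previous (newer) text is dropped here, so at
                # most one fulltext is ever held in memory.
                text = apply_delta(text, payload)
            yield revision_id, text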

> This means that while reading recent revisions will have low CPU
> requirements, it will have higher IO requirements, because it can only
> stop reading the container once it has read (or seeked past) all the
> fulltexts it needs, and (almost) all the deltas which are based on them.

I don't see how the increased IO load follows; perhaps an example will
help. Given revisions A:[], B:[A], C:[B], D:[A], E:[D,C], in my head we
have something like:
record | Fileid | revision | compression-basis-list | DATA |
0      |   X    |   E      | []                     | ...  |
1      |   X    |   D      | [E]                    | ...  |
2      |   X    |   C      | [E]                    | ...  |
3      |   X    |   B      | [C]                    | ...  |
4      |   X    |   A      | [D]                    | ...  |
5      |   Y    |   E      | []                     | ...  |
6      |   Y    |   D      | [E]                    | ...  |
...

That is, to build the tree E, you read record 0 and record 5 to get the
texts of file X and Y.

In terms of disk cache and head activity, this pages in two 4K pages
(or maybe only one).

To build the tree D, you read records 0 and 1, then 5 and 6, and so on.

On average, to access an arbitrary text in a file, whether you go
new-to-old or old-to-new, you page in one fulltext plus half the
delta-chain length's worth of patches.
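
To make that checkable, here is a toy version of the index above (the
names and layout are illustrative only):

    # (file_id, revision) -> (record_number, compression_basis),
    # matching the record table above.
    INDEX = {
        ('X', 'E'): (0, None), ('X', 'D'): (1, 'E'),
        ('X', 'C'): (2, 'E'), ('X', 'B'): (3, 'C'),
        ('X', 'A'): (4, 'D'),
        ('Y', 'E'): (5, None), ('Y', 'D'): (6, 'E'),
    }

    def records_to_read(file_id, revision):
        # Follow compression bases back to a fulltext, then read
        # in on-disk order so the pass stays sequential.
        needed = []
        while revision is not None:
            record, basis = INDEX[(file_id, revision)]
            needed.append(record)
            revision = basis
        return sorted(needed)

    # Tree E: records_to_read('X', 'E') == [0], ('Y', 'E') == [5].
    # Tree D: records_to_read('X', 'D') == [0, 1], ('Y', 'D') == [5, 6].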

>  Reducing the size of the delta chain will reduce the amount of data
> that needs to be read for recent revisions, at the cost of increasing
> the amount of data that needs to be read for older revisions.

Reducing the delta chain will *increase* the amount of data to be read
to reconstruct recent revisions, AIUI. 

> By contrast, a forwards-patching format would allow deltas to be written
> for new versions, minimizing the number of fulltexts written.

My proposed format above allows both forward and backwards patching. One
would generate backwards patches during 'pack' and 'pull' or 'push', and
forward patches (or no patches) during commit.
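
A sketch of that policy (illustrative names only; the point is that the
record format is the same either way, only the choice of basis differs):

    def choose_basis(operation, parents, next_newer):
        # parents: the revision's parents, already stored at commit
        # time.  next_newer: the newer text emitted just before this
        # one in a pack/push/pull stream.
        if operation == 'commit':
            # Forward patching (or a fulltext when there is no
            # parent): nothing already on disk is rewritten.
            return parents[0] if parents else None
        # 'pack', 'push', 'pull': backward patching, as in the record
        # table above, keeps recent trees cheap to build.
        return next_newer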

>   Some
> containers could contain only deltas, therefore reducing the need for
> repository compaction.

Containers holding only deltas actually increase the need for
repository compaction, because they force disk seeks, which are
expensive.

>   Therefore, forwards-patching reduces storage
> space and IO at the cost of increased CPU.  It also means that
> building recent revisions will be slower than building snapshotted
> revisions, but it's not clear how much slower it would be.

-Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.