Split revfiles?

John Arbash Meinel john at arbash-meinel.com
Mon Jun 20 16:29:47 BST 2005

Martin Pool wrote:

>On 20 Jun 2005, Aaron Bentley <aaron.bentley at utoronto.ca> wrote:
>>Hi all,
>>I was just thinking about revfiles (the planned efficient storage
>>format) and remote access.
>>If we only support protocols that allow us to retrieve particular
>>sections of a file, then efficient downloading won't be a problem.  But
>>since revfiles have full texts every x revisions (where x is currently
>>25), it may make sense to start a new revfile at that point.  That way,
>>each sub-file is self-sufficient for the revisions it contains, and we
>>never need to download the whole revision history when we don't need it.
>That's an interesting idea.  That would seem to imply some kind of
>two-level indexing to find a particular text: either the inventory
>says [which-revfile, which-text], or there is an index-of-revfiles.
Well, you could keep the current inventory system, where each file also
has an associated "text_id", and that text id tells you which pair of
index-file/revfile you are looking at.

You could also record the revfile in the index itself, but I'm guessing
that wouldn't work well for keeping the index file chunks small and
fixed in size.
I suppose you could use either a sequential number (32 bits = 4 bytes) or
something like a UUID (16 bytes). And then the associated revfile would be:

Either would work, as long as you can store some identifier in the index
file and have an obvious way of converting it into a path.
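
To illustrate the identifier-to-path idea, here is a minimal Python
sketch using a sequential 32-bit number. The function names, the 4-byte
big-endian encoding, and the "<text_id>.<hex>.revfile" naming are all
assumptions made up for this example; none of it is the actual revfile
layout.

```python
import os
import struct

# Hypothetical encoding: a 4-byte big-endian unsigned integer, so the
# identifier adds a fixed 4 bytes to every index record.
REVFILE_ID = struct.Struct(">I")


def pack_revfile_id(seq):
    """Encode a sequential revfile number as 4 bytes for the index."""
    return REVFILE_ID.pack(seq)


def unpack_revfile_id(data):
    """Decode the 4-byte identifier read back from an index record."""
    return REVFILE_ID.unpack(data)[0]


def revfile_path(store_dir, text_id, seq):
    """Map (text_id, seq) to a file path; the naming scheme is assumed."""
    return os.path.join(store_dir, "%s.%08x.revfile" % (text_id, seq))
```

A UUID would work the same way with a 16-byte field; the sequential
number just keeps the index record smaller and the ordering obvious.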

The tricky part, as I see it, is figuring out which file to append to
when you are adding a new revision. Right now, you need to lock the
revfile, append to it, fsync it, unlock the revfile, lock the index
file, append to it, fsync, unlock, fill out the inventory, close, fill
out the revision, close, and do an atomic append to revision_history.
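
The lock/append/fsync/unlock part of that sequence can be sketched
roughly as follows. This is only an illustration, assuming POSIX
advisory locks via fcntl.flock; the helper names and the 12-byte
(offset, length) index record are hypothetical, and the inventory,
revision, and revision_history steps are left out.

```python
import fcntl
import os
import struct

# Hypothetical fixed-size index record: 8-byte offset + 4-byte length.
INDEX_RECORD = struct.Struct(">QI")


def locked_append(path, data):
    """Append data to path under an exclusive lock; return its offset."""
    with open(path, "ab") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            # Find the end only after the lock is held, so that another
            # writer cannot move it between the seek and the write.
            offset = f.seek(0, os.SEEK_END)
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
    return offset


def add_text(revfile_path, index_path, data):
    """Append to the revfile first, then record it in the index."""
    offset = locked_append(revfile_path, data)
    locked_append(index_path, INDEX_RECORD.pack(offset, len(data)))
    return offset
```

Writing the revfile before the index means a crash in between leaves
unreferenced bytes in the revfile rather than an index entry pointing
at data that was never written.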

With multiple possible revfiles, one process might append to the
original revfile while another decides that a new file needs to be
created. I don't know that it really hurts anything, but it is
something to think about.
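
One possible way to keep two processes from disagreeing is to make the
"which revfile?" decision while holding a single per-text lock. The
sketch below is only an assumption about how that could look: the lock
file, the helper names, and the rule "start a new revfile once the last
one holds 25 texts" are hypothetical, with 25 taken from the full-text
interval mentioned earlier in the thread.

```python
import contextlib
import fcntl

TEXTS_PER_REVFILE = 25  # assumed rollover point


@contextlib.contextmanager
def text_lock(lock_path):
    """Hold an exclusive advisory lock for the duration of the block."""
    with open(lock_path, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


def pick_revfile(entry_counts, limit=TEXTS_PER_REVFILE):
    """Return the sequence number of the revfile to append to.

    entry_counts lists how many texts each existing revfile holds, in
    sequence order. Call this only while holding text_lock, so every
    writer sees the same counts and makes the same choice.
    """
    if not entry_counts:
        return 0
    last = len(entry_counts) - 1
    if entry_counts[last] >= limit:
        return last + 1  # last revfile is full: start a new one
    return last
```

The append itself would also have to happen inside the same locked
block; otherwise the counts could change between the decision and the
write.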


PS> I think that if you want to split revfiles, using a sequential
number is a decent method.
