[RFC] bzr.jrydberg.versionedfile

Wed Dec 21 16:29:01 GMT 2005

John Arbash Meinel <john at arbash-meinel.com> writes:

>> Far from optimal, but uses the defined APIs.
>
> What would you consider optimal, and how different would it be to get us
> there? I don't think we are stuck on any specific API, we won't reach
> 'stable' until February. :) Far better to do the right thing now, than
> be hackish.

I would consider the current implementation optimal in the sense that
it does not have to compare any inventories to find out what file
versions to pull.  There are a few small implementation problems of
course, but those can be fixed quite easily.

Regarding the API: No, it is not written in stone.  But I have defined
an API that I am quite fond of.  I'm talking about the VersionedFile,
VersionedFileStore and RevisionStore classes.  Using these I think we
can implement almost any history format.  They need a little bit more
love to be complete, especially the store classes.

>>>	1) Grab a list of revisions
>>>	2) Figure out the set of files involved. This is either done by
>>>	   reading inventories, or with your delta object.
>>>	3) For each file, either:
>>>		a) Pull in only changes which match the list of
>>>		   revisions you are expecting to fetch
>>>		b) Pull in everything, because usually the waste
>>>		   will be very small (usually none)
>>>	4) Fetch the text of the inventory, and check all of the
>>>	   associated texts, to make sure they have what you need
>>>	5) Commit this inventory, then commit the revision
>>>	6) Go back to 2 for the next inventory.
>
> I don't see any specific problems. I think it is pretty much what I was
> suggesting. You can do everything direct to disk, you just have to do it
> in the right order. Changes can include revisions which aren't fully
> added yet, since it isn't part of the contract.

Yes, with the exception of steps 5 and 6.

> One alternative would be to have a WAL of sorts, which is just a list
> of revision-ids which have been committed to the store. So the
> transaction id then becomes the revision id (which is really what we are
> doing right now, we are just using the revision-store as the WAL).

Sorry, but what does WAL mean? 

I had an idea some time ago of having a 'revision-graph' file in .bzr
that contains (revision-id, parents) tuples of all revisions available
in the revision-store.  I think that using such a file is a cleaner
design than relying on the index of the inventory or revision knits
to extract ancestry and graph information about the branch,
especially in the case where the inventory and revision knits are
shared between several branches.
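
To make the idea a bit more concrete, here is a rough sketch of what
such a file could look like and how ancestry could be read from it.
Nothing here exists in bzr: the line format (one revision per line,
revision-id followed by its parent ids, separated by spaces) and the
function names serialize_graph, parse_graph and ancestry are all
assumptions made up for illustration, and it assumes revision ids
never contain whitespace.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical 'revision-graph' format (an assumption, not a bzr format):
# one line per revision, "<revision-id> <parent-id> <parent-id> ...".
Graph = Dict[str, Tuple[str, ...]]


def serialize_graph(entries: Iterable[Tuple[str, Tuple[str, ...]]]) -> str:
    """Turn (revision-id, parents) tuples into the text of the file."""
    return "".join(
        " ".join([rev_id, *parents]) + "\n" for rev_id, parents in entries
    )


def parse_graph(text: str) -> Graph:
    """Read the file text back into a revision-id -> parents mapping."""
    graph: Graph = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # tolerate blank lines
        graph[fields[0]] = tuple(fields[1:])
    return graph


def ancestry(graph: Graph, tip: str) -> Set[str]:
    """Return tip plus every ancestor of tip that is listed in the graph.

    Parents that are referenced but not listed in the file (not present
    in the revision-store) are skipped.
    """
    seen: Set[str] = set()
    pending = [tip]
    while pending:
        rev_id = pending.pop()
        if rev_id in seen or rev_id not in graph:
            continue
        seen.add(rev_id)
        pending.extend(graph[rev_id])
    return seen


if __name__ == "__main__":
    entries = [
        ("rev-1", ()),
        ("rev-2", ("rev-1",)),
        ("rev-3", ("rev-1",)),
        ("rev-4", ("rev-2", "rev-3")),
    ]
    graph = parse_graph(serialize_graph(entries))
    print(sorted(ancestry(graph, "rev-4")))
```

The point of the sketch is only that ancestry can be answered from
this one file per branch, without opening the knit indexes, so shared
knits would not need to know which branch a revision belongs to.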

~j