[RFC] two-phase version add?

Aaron Bentley aaron.bentley at utoronto.ca
Mon Jun 25 16:57:33 BST 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

We've been able to take advantage of the reference dependencies in our
storage to keep half-added data from being a problem.

File entries are irrelevant unless there's an inventory that points at
them.  Inventory entries are irrelevant unless there's a revision that
points at them.

So when we write, we write in this order:
1. files
2. inventories
3. revisions

This ordering ensures correctness, because nothing becomes visible until
the revisions are added.  So if we are interrupted, nothing is visible.
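
A minimal sketch of that ordering, assuming a toy in-memory store.
ToyRepository, add_revision, visible_files and the fail_after hook are
hypothetical names for illustration, not bzrlib's API:

```python
class ToyRepository:
    """Hypothetical in-memory store; readers only enter via revisions."""

    def __init__(self):
        self.files = {}         # file_id -> text
        self.inventories = {}   # revision_id -> list of file_ids
        self.revisions = set()  # revision ids, the only entry point

    def add_revision(self, revision_id, texts, fail_after=None):
        # 1. files
        for file_id, text in texts.items():
            self.files[file_id] = text
        if fail_after == "files":
            raise IOError("interrupted after files")
        # 2. inventories
        self.inventories[revision_id] = sorted(texts)
        if fail_after == "inventories":
            raise IOError("interrupted after inventories")
        # 3. revisions: only now does anything become reachable
        self.revisions.add(revision_id)

    def visible_files(self):
        """Return the texts reachable from a revision, as a reader sees them."""
        result = {}
        for revision_id in self.revisions:
            for file_id in self.inventories[revision_id]:
                result[file_id] = self.files[file_id]
        return result
```

If add_revision is interrupted after the inventory step, the file and
inventory data are already stored, but visible_files() still returns
nothing because no revision points at them.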

But when we read, we must read the inventories before we can read the files.

For fetch, we do both reads and writes.  This means that we have to read
the inventory, then read the files, then read the inventory again.  Now,
we could read the inventory to a temp file.  That would at least avoid
server round-trips.  But it seems inelegant and a bit inefficient.

It would be nice if we could first copy the inventory into local storage
(e.g. inventory.knit), then read the files into local storage, then
mark the inventory active* (e.g. by updating inventory.kndx).
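
Roughly what I have in mind, as a sketch only.  TwoPhaseStore and
fetch_revision are made-up names; the dict stands in for the data file
(inventory.knit) and the set for the index (inventory.kndx):

```python
class TwoPhaseStore:
    """Hypothetical two-phase store: add data first, activate it later."""

    def __init__(self):
        self._data = {}      # stands in for the data file (e.g. inventory.knit)
        self._index = set()  # stands in for the index (e.g. inventory.kndx)

    def add_pending(self, version_id, content):
        # Phase 1: the content is stored locally but not visible to readers.
        self._data[version_id] = content

    def read_pending(self, version_id):
        # The writer may read back what it stored before activation.
        return self._data[version_id]

    def activate(self, version_id):
        # Phase 2: make the version visible by recording it in the index.
        if version_id not in self._data:
            raise KeyError(version_id)
        self._index.add(version_id)

    def get(self, version_id):
        # Ordinary readers only see activated versions.
        if version_id not in self._index:
            raise KeyError(version_id)
        return self._data[version_id]


def fetch_revision(revision_id, remote_inventory, remote_files,
                   inventories, files):
    """Copy the inventory once, then the files, then activate the inventory."""
    inventories.add_pending(revision_id, remote_inventory)
    for file_id in inventories.read_pending(revision_id):
        files[file_id] = remote_files[file_id]
    inventories.activate(revision_id)
```

The inventory is read from the remote side once.  The second read comes
from local storage, and nothing is visible until activate() runs.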

That would also be nice for bundle files.  Version 4 is specifically
designed to be installed, so it contains the entries in write order.
That makes it an inappropriate choice to behave as a repository.  Read
order makes much more sense there, and would allow us to, say, generate
a revision tree by streaming through the file.

With bundles being in write order, we have to seek.  But the bundles are
bzip2-compressed, which hampers seeking backwards.  So in order to
construct a revision tree, we have to stream through the file at least
twice: once to get the inventory, and once to get the file diffs.

I think it would be preferable to implement two-phase version adds, so
that we could write the revision and inventory at the beginning of the
bundle, then the files, then activate the inventory, then activate the
revision.
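
To illustrate the proposed order, here is a sketch with an invented
record format (JSON lines inside bzip2).  This is not the real bundle
format, and write_bundle and build_tree are hypothetical names.  The
point is only that the reader can build a tree in one forward pass:

```python
import bz2
import io
import json


def write_bundle(revision_id, inventory, texts):
    """Serialise records in the proposed order; the format is made up."""
    records = [("revision", revision_id, None),
               ("inventory", revision_id, inventory)]
    records += [("file", file_id, texts[file_id]) for file_id in inventory]
    records += [("activate-inventory", revision_id, None),
                ("activate-revision", revision_id, None)]
    payload = "\n".join(json.dumps(r) for r in records).encode("utf-8")
    return bz2.compress(payload)


def build_tree(bundle_bytes):
    """Build {file_id: text} in one forward pass, never seeking backwards."""
    inventory = None
    tree = {}
    with bz2.open(io.BytesIO(bundle_bytes), "rt", encoding="utf-8") as stream:
        for line in stream:
            kind, key, value = json.loads(line)
            if kind == "inventory":
                inventory = value
            elif kind == "file":
                # The inventory has already been seen, so each file record
                # can be placed in the tree as soon as it arrives.
                if inventory is None or key not in inventory:
                    raise ValueError("file record without inventory: %s" % key)
                tree[key] = value
    return tree
```

An installer reading the same stream would add the revision and
inventory as pending, store the files, and only act on the two
activation records at the end.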

Aaron

* The term here would usually be "commit", as in "commit a transaction",
but I thought that would be confusing to use.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGf+Xt0F+nu1YWqI0RAsxAAJ42wEn2JobRN9LXOjyZuDM4zbdrVQCfWeOf
0EaHcp4j9t/8NwapZfPjgBw=
=AdYR
-----END PGP SIGNATURE-----


