Revfile vs Atomicity & Dumbfs

Martin Pool mbp at sourcefrog.net
Tue May 10 00:32:56 BST 2005


On  9 May 2005, John A Meinel <john at arbash-meinel.com> wrote:

> Right now, I think you are just keeping a complete copy of each revision
> of a file, which you obviously don't want to do over time. The current
> suggestion is to use the "revfile" method, which has an append-only
> index and an append-only text store.

Correct.
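
For anyone who hasn't looked at the idea, here is a rough sketch of an
append-only text store with an append-only index.  It is not the
actual revfile format: the record layout, the ".txt"/".idx" suffixes
and the function names are all made up for illustration, and it stores
full texts with no deltas or compression.

```python
import os
import struct

# Hypothetical fixed-size index record: (offset, length) of one stored text.
INDEX_RECORD = struct.Struct(">QQ")


def append_text(base, data):
    """Append data to the text store, then record it in the index.

    Returns the record number of the new index entry.
    """
    with open(base + ".txt", "ab") as store:
        store.seek(0, os.SEEK_END)
        offset = store.tell()
        store.write(data)
        store.flush()
        os.fsync(store.fileno())
    # The index entry is written only after the text has been flushed, so
    # an interruption in between is meant to leave unreferenced bytes in
    # the store rather than an index entry pointing at missing text.
    with open(base + ".idx", "ab") as index:
        index.seek(0, os.SEEK_END)
        record_no = index.tell() // INDEX_RECORD.size
        index.write(INDEX_RECORD.pack(offset, len(data)))
        index.flush()
        os.fsync(index.fileno())
    return record_no


def read_text(base, record_no):
    """Look up one index record and return the text it points at."""
    with open(base + ".idx", "rb") as index:
        index.seek(record_no * INDEX_RECORD.size)
        offset, length = INDEX_RECORD.unpack(index.read(INDEX_RECORD.size))
    with open(base + ".txt", "rb") as store:
        store.seek(offset)
        return store.read(length)
```

Both files only ever grow, so existing offsets stay valid and a reader
that sees an older, shorter index still gets consistent data.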

> The thing is, append-only isn't very transaction safe, it's certainly
> better than write anywhere, but new-file only works better with backups,
> and atomicity. 

It rather depends what you mean by "transaction safe" and "atomic".
Are you talking about isolation between concurrent processes, or
recovery in case of a program or machine crash, or something else?

New files are not sufficient for perfect recovery in case of a machine
crash.  Few filesystems guarantee that files reach the disk in the
order they were created, so a later file can survive a crash when an
earlier one it refers to does not.  I think people who are worried
about this should probably back their storage with a database that
does write-ahead logging.

> And unless I'm mistaken, it is easier to add a new file
> to a remote connection, than it is to append to an existing one (at
> least with sftp/ftp, webdav may be different).

sftp can append to a file.  I don't know if this works reliably on all
implementations.

> I was pretty concerned with the "bzr fix" command, which says that if
> you get your tree borked, you run it to create a new directory that has
> as much as it can save, rather than fixing in place.

The reason is that it's better not to have recovery processes write
over the data they're trying to recover, in case they get it wrong.
If the recovery is satisfactory, you can remove the damaged data and
put the recovered branch in its place.  But it isn't written yet; it's
only a proposal, and perhaps it would be OK just to suggest that
people make a backup first.

> Why not instead of having an append-only text-store, have a directory
> where you insert new items.

This is the heart of it.  

The minimum we can assume from a dumb protocol is that it will allow
uploading and downloading whole files.  To avoid uploading or
downloading a lot of redundant data, we want small, granular files.
However, these protocols also have per-file overhead, as do local
filesystems, so we don't want too many files either.  It's a
tradeoff.  I don't think revfiles are a worse tradeoff than the other
options.

> I also thought about atomicity, and I thought about two basic methods,
> WAL (write-ahead logging), and clone and replace.

Again, what do you mean by 'atomicity'?

> Basically, wal would
> be something like .bzr-transaction-log which would include what has
> occurred with the tree, when the final commit occurs, the file could be
> deleted. Clone-and-replace is basically, copy everything from .bzr to
> .bzr-new, make modifications to .bzr-new, and then
> rm -rf .bzr
> mv .bzr-new .bzr
> 
> The clone-and-replace would be really fast on a system that supports
> hard-links. You can hardlink, and then copy-on-write. Which allows the
> amount of IO to be roughly O(changes) rather than O(repository).

Um, no.  You'd need to make a hardlink for every file in the
repository, which takes some time.
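
To make the cost concrete, here is a small sketch of a hardlink clone.
The function name is made up and this is not anything in bzr; it just
shows that the loop runs once per file in the tree, however small the
eventual change is.

```python
import os


def hardlink_clone(src, dst):
    """Recreate the tree at src under dst, hard-linking every file.

    Returns the number of link operations performed.
    """
    links = 0
    for dirpath, dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            # One link() call per file: the work is proportional to the
            # number of files in the tree, not to the size of the change.
            os.link(os.path.join(dirpath, name),
                    os.path.join(target_dir, name))
            links += 1
    return links
```

No file contents are copied, so it is cheaper than a full copy, but
the number of system calls still grows with the repository.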

The hard part of atomicity has nothing to do with the database; it
lies in updating the working directory during an update or merge
operation.  If that is interrupted, the working directory might be
left in a very confused state with respect to the metadata.

The current code makes commits atomic by replacing the
revision-history file.  The replacement is atomic on Unix and nearly
atomic on Windows (where it could be improved).  If it does not
happen, the transaction leaves behind some objects that are not
linked in, but there are no other side effects.
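
The general technique looks roughly like the sketch below.  This is
not the code in bzr: the function names are invented, the
one-revision-id-per-line layout is an assumption for the example, and
it uses os.replace from current Python.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write data to a temporary file, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so that the final
    # rename stays within one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # Readers see either the old file or the new one, never a
        # partially written file.
        os.replace(tmp, path)
    except BaseException:
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise


def commit(history_path, revision_id):
    """Append revision_id to the history by rewriting the whole file."""
    try:
        with open(history_path, "rb") as f:
            old = f.read()
    except FileNotFoundError:
        old = b""
    atomic_write(history_path, old + revision_id.encode("utf-8") + b"\n")
```

If the process dies before the rename, the old history is untouched
and only a stray temporary file is left behind.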

-- 
Martin