lock free dirstate - prototype

Tue Sep 29 23:42:52 BST 2009

On Tue, 2009-09-29 at 14:25 +0100, Martin Pool wrote:

> Creating and manipulating directories is reportedly pretty slow on
> Windows (and maybe other places) so it would be worth avoiding any
> unnecessary lockdir-type io.  I'd rather look at taking any useful
> features from lockdir into the core format of this.  For instance,
> it's probably not safe to rely on any particular behaviour for
> rename('newfile', 'file') but you can do a directory-renaming dance.

Yup, that's why I haven't so far.

> However, presumably we do need a semantic wt lock - how would that fit in?

The semantic lock is still needed, and for CLI use we aim for one
lock-unlock regardless. We can't reuse that for stat cache updates
trivially [layers] but we could look at taking that separate lock out as
a way to do locking on the dirstate contents when doing mutations. That
would completely preclude mutating when someone has a write lock [not a
bad thing :P] but would also lead to write operations like 'commit'
being potentially blocked by 'bzr st', if we managed the lock lifetime
wrongly. It's worth thinking about: the hard work is done (refactoring
dirstate to use multiple files).

> >  - how to deal with two writers with the same new content / ABA state
> > transitions combining with slow-readers that delete what they think is
> > unlinked content, but actually has been relinked. I'm thinking that slow
> > readers need to acquire the current file ownership before they unlink
> > anything.  Two writers with the same new content is harder; one will win
> > renaming current, and one rolling back will delete the others content. I
> > think we need some randomness/uniqueness in either the filename or the
> > content. (e.g. put a pid in there).
> 
> I'm not sure I understand.  I don't think readers should unlink files
> - do they really need to?  And won't simultaneous writers be blocked
> out by the exclusive semantic lock?

Unlinking of files - on win32 a reader prevents unlinking of a file that
it's reading. That means we have to unlink at some arbitrary later time
than the point where the file stopped being logically interesting.
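
A minimal sketch of that "unlink at some later time" idea, not the
prototype's code. The class name DeferredUnlinker and its methods are
hypothetical. It assumes that unlinking a file which a reader still has
open fails with PermissionError, which is what Python typically reports
on win32; on POSIX the first attempt normally just succeeds.

```python
import os


class DeferredUnlinker:
    """Hypothetical helper: remember unlinks that failed and retry them later."""

    def __init__(self):
        # Paths that are logically dead but could not be removed yet.
        self.pending = []

    def unlink(self, path):
        try:
            os.unlink(path)
        except FileNotFoundError:
            pass  # already removed by someone else
        except PermissionError:
            # Assumed win32 behaviour while a reader holds the file open.
            self.pending.append(path)

    def retry_pending(self):
        """Retry queued unlinks; return how many are still outstanding."""
        still_pending = []
        for path in self.pending:
            try:
                os.unlink(path)
            except FileNotFoundError:
                pass
            except PermissionError:
                still_pending.append(path)
        self.pending = still_pending
        return len(still_pending)
```

A caller would invoke retry_pending() at some convenient later point,
for example the next time it opens the dirstate.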

Concurrent writers: semantic writers are mutexed out by the tree write
lock. stat cache updaters are not, but their changes are discarded if a
semantic writer has an update. So if you run 'bzr st' twice, at the same
time, you'll have two stat cache updaters, each with the same new
dirstate. You could have one semantic writer and one stat cache updater
with the same changes. Anyhow, assuming two stat cache updaters:
They then both write
    12345
    12345.current
They both attempt
    mv current 12345.check
Writer A wins, and may go on to complete the transaction, or may
roll back if a semantic update had occurred [busy system hey :)].
Writer B loses and may either spinlock or roll back (because it's only a
stat cache update).

What happens next is the problem. If A inserts 12345 (by renaming
12345.current -> current) and B rolls back, B will remove 12345 and
12345.current. Now, today in my code B doesn't roll back, it just queues
to complete and then when it acquires current it finds that
12345.current is missing and will error out without taking further
actions. This will leave the system needing recovery, but it's still well
defined.

I'd like to avoid these conditions altogether.
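
One possible direction is the uniqueness idea quoted earlier (put a pid
or some randomness in the filename). The sketch below is only an
illustration under assumptions: the function names are hypothetical, the
pid-plus-random-token naming scheme is invented here, and the contents
written to the '.current' file are a placeholder, since the real layout
is not described in this mail. The point it shows is that a writer
rolling back can then only delete files it created itself.

```python
import os
import uuid


def unique_basename(content_id):
    # content_id stands in for the "12345" above. Adding the pid and a
    # random token keeps the name private to one writer even when two
    # writers hold identical new content.
    return "%s.%d.%s" % (content_id, os.getpid(), uuid.uuid4().hex)


def write_candidate(directory, content_id, data):
    """Write the new content file and its '.current' companion."""
    base = unique_basename(content_id)
    content_path = os.path.join(directory, base)
    current_path = content_path + ".current"
    with open(content_path, "wb") as f:
        f.write(data)
    with open(current_path, "wb") as f:
        # Placeholder contents (assumption): just the content file name.
        f.write(base.encode("ascii"))
    return content_path, current_path


def rollback(paths):
    """Remove only the files this writer created."""
    for path in paths:
        try:
            os.unlink(path)
        except FileNotFoundError:
            pass
```

With names like these, B rolling back removes B's two files and leaves
whatever A inserted untouched.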

> >> Does this write a whole new copy on each update, or incremental files?
> >
> > Full copy. There's a bunch of stuff we /can/ do in a new tree format; I've
> > picked a single conceptual bug which has many ramifications and tried to
> > solve that. Some other ones I've not put any coding effort into are:
> >  - write less on partial updates
> >  - only store the left most parent
> >  - persist an id2path map
> 
> and, for me, getting away from any need to write on logical read
> operations would be great.

We can do that if we stop using a stat cache, but doing so seems to
imply that 'touch iso; bzr st; bzr st' will be a very slow process the
second time.
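
To make the trade-off concrete, here is a minimal stat cache sketch. It
is not bzr's dirstate code: the class StatCache and the choice of (size,
mtime) as the fingerprint are assumptions made for illustration. The
file is only re-hashed when its stat fingerprint changes, so a 'touch'
forces one re-hash, and the following check is cheap only if the
refreshed entry was kept.

```python
import hashlib
import os


class StatCache:
    """Hypothetical cache: path -> (stat fingerprint, sha1 of content)."""

    def __init__(self):
        self._entries = {}
        self.hashes_computed = 0  # counts the expensive full reads

    @staticmethod
    def _fingerprint(path):
        st = os.stat(path)
        # Assumed fingerprint; chosen only to keep the example small.
        return (st.st_size, st.st_mtime_ns)

    def sha1(self, path):
        fingerprint = self._fingerprint(path)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == fingerprint:
            return cached[1]  # stat unchanged: trust the cached hash
        digest = hashlib.sha1()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
        self.hashes_computed += 1
        result = digest.hexdigest()
        # This store is the "write on a logical read" under discussion.
        self._entries[path] = (fingerprint, result)
        return result
```

If the store at the end of sha1() is dropped, every later call on a
touched file has to read the whole file again, which is the slow second
'bzr st' described above.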

> > We have a requirement from Mark to fix the way we manage and announce
> > development formats, and I think that taking this new format *out of
> > development* must be gated on that; that is different to the format UI
> > though - though perhaps that is the actual thing you meant?
> 
> Mark, AfC and others have raised the troubles caused by format
> transitions.  We should not ask people to go through this again.
> 
> I didn't mean "before posting absolutely anything about it" but I do
> think we should improve it before putting it on track to be default,
> or before asking for widespread testing.

I do think we should get some considered testing from key use cases
before we consider this 'done', otherwise it will idle and be neither
completed nor discarded. Specifically, having representative users on
NFS, AFS and SMB confirm that it doesn't eat their data would be nice.

We're not there yet anyhow, it's a sketch - a nice sketch, but a sketch.

-Rob