[MERGE] repository write groups

Tue Jul 31 16:17:52 BST 2007

On 7/30/07, Robert Collins <robertc at robertcollins.net> wrote:

I'm glad you think they're good questions.  Getting this right, I think,
is important to being able to keep repository-using code clean while
varying the repository.  I don't mind putting in write groups without
getting all the way there, so please resend the patch with the docs, and
we can talk maybe a bit more.

> On Mon, 2007-07-30 at 23:11 -0500, Martin Pool wrote:
> > On 7/30/07, Robert Collins <robertc at robertcollins.net> wrote:
> > > On Mon, 2007-07-30 at 22:16 -0500, Martin Pool wrote:
> > No, that wasn't what I meant.  What I meant was that it would implicitly
> > take a lock if _and when_ necessary.  For your example, it's during
> > finalization.  For another it might be the whole time, or not at all.
> >
> > With this change, the caller doesn't know when the repository should be
> > locked.
>
> I disagree. The caller should lock around the period they want
> consistent views of data, just as currently.

As you say, the names are causing confusion.  "Lock" means "exclude other
people from writing to this": that's what it means in general, and that's
what the lock elements in our control files do.  My point is that we're
changing it so that the caller no longer directly controls that or even
sees when it's happening.  Instead, they say when they want to insert data
or to have a consistent view or whatever, and the repository decides if
and when it should lock.  No?

> >   It is kind of an inversion or displacement of control, which is a
> > good thing because it allows more room for variation by the repo.  But if
> > the caller no longer controls when the repository is locked, why are they
> > calling lock_write?
>
> I think our method names may be adding confusion, which is a shame,
> because its less clear. But we've got I think reasonable solid concepts
> in play in the code base.
>
> Callers call lock_write to get:
>  - the ability to insert data (if you can't lock-write, you cannot
> insert)
>  - a consistent view on the data - in principal data cannot be *removed*
> while anyone has a write lock, regardless of the repository format - and
> new data inserted by other writers is ignored. In practice until we have
> a gc command implemented we can't be at all sure we have this nailed. ce
> la vie.

On the first: you can't write til you call start_write_group.  So
lock_write just gives you permission to call that function, which seems a
bit... bureaucratic.

The second would be fine, except that it's not actually true. :-)  If we
wanted to guard against data being removed while it's in use, we would
presumably want that when reading as well as when writing, so would need
to write a lock even when reading.  To me this says more about the
conceptual difficulties of doing gc-in-place with non-locking readers.
For newer formats, we don't want to hold a physical lock for the whole
time, so we can't use it to exclude gc.

> > One reason might be that they want to rely on keeping some things in
> > memory so that they can more cheaply do the hooks and update the wt after
> > the write group finishes.  Maybe they should do lock_read and do the write
> > group inside that?
>
> That requires upgradable locks in some formats, which will be tricky to
> do for some underlying formats. The big advantage of keeping our current
> interface and not shuffling things around is that it plays nice with
> everything from our old all-in-one formats up to the latest stuff I'm
> working on, and with svn/git/hg as well.

The weave formats where we have a read lock are now pretty old, and we
could deprecate writing to them...

> I think that the direction and questions you are asking are good ones,
> but that it would be very easy to define how it should work for an ideal
> pack based repository, yet make it much harder to thunk well for older
> formats or some foreign formats. To that end I propose doing nothing in
> the short term and keeping with:
> - lock_read to get a cache context
> - lock_write to get a cache context and the ability to call
>   start_write_group
> - start_write_group et al are essentially no-ops for repositories where
>   lock_write takes out a mutex, and core functionality for ones where
>   actual transactional functionality can take place.
>
> Which is to say, that callers using large scale operations like
> 'commit', 'merge' etc make no changes, callers doing fine grained things
> like 'add_inventory' will need to be changed to use write groups, or
> better yet the Builder style interface for adding commits.

It seems like for new format we really just want two operations the client
can ask for: "start caching/discard cache" and "start writing/finish
writing".

I think the main concrete change I want is that start_write_group should
not insist on the caller previously taking a lock.  Then at least we can
move away from having the callers know about locking.  It can implicitly
take a write lock if the format is one which needs it.  I just don't think
it should be part of the start_write_group contract.

-- 
Martin