[MERGE] repository write groups

Tue Jul 31 05:11:28 BST 2007

On 7/30/07, Robert Collins <robertc at robertcollins.net> wrote:
> On Mon, 2007-07-30 at 22:16 -0500, Martin Pool wrote:
> > Great, so say that!  "You are only allowed to insert data into a
> > repository during a write group", if that's the case.
>
> I'll add that to the docstring for start_write_group. I do think they
> should be fully self explanatory without needing external docs outside
> the API docs.

Yes, but I think the overall docs should explain the main concepts, and
"what's a write group" deserves to be answered.  If it's "a thing you must
be inside before writing" then that's not a detail of start_write_group,
but rather something you need to know to use the repository at all.

> > So it does sound a bit like this is really the write group, and it
> > should implicitly take (or increment) a write lock if necessary.  "I'm
> > starting writing... I'm done."
>
> Uhm not really.
>
> Consider commit:
> lock_write
> start_write_group
> read basis data and per file graph data and so on while we
>  - insert texts for new file versions
>  - insert an inventory
>  - insert a revision
>  - insert a signature
> ** at this point all the new non index data is in a temporary pack file.
> commit_write_group
>  - finalise indices
> ** at this point all new data is in temporary files
>  - physical lock taken
>  - name allocated
>  - new data renamed into place
>  - physical lock removed
> use new revision data for post commit actions like sending emails,
> updating the basis inventory in the working tree
> unlock
>
> If you are saying 'start_write_group should do self.lock_write()', I
> don't think so. Maybe I'm wrong but it doesn't line up all that closely.

No, that wasn't what I meant.  What I meant was that it would implicitly
take a lock if _and when_ necessary.  For your example, it's during
finalization.  For another it might be the whole time, or not at all.
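
To make "if and when necessary" concrete, here is a rough sketch of a
write group that takes the physical lock only while finalising.  Only the
names start_write_group and commit_write_group come from this thread;
ToyRepository, insert, lock_events and the in-memory "pack" are
hypothetical stand-ins, not bzrlib's actual implementation.

```python
import threading


class ToyRepository:
    """Hypothetical stand-in for a repository; not bzrlib's Repository."""

    def __init__(self):
        self._physical_lock = threading.Lock()
        self._pending = None      # data staged by the current write group
        self.committed = []       # data that has been "renamed into place"
        self.lock_events = []     # record of when the physical lock was held

    def start_write_group(self):
        if self._pending is not None:
            raise RuntimeError("already in a write group")
        # No lock is taken here: new data only goes to a private staging area.
        self._pending = []

    def insert(self, item):
        if self._pending is None:
            raise RuntimeError("data may only be inserted during a write group")
        self._pending.append(item)

    def commit_write_group(self):
        if self._pending is None:
            raise RuntimeError("not in a write group")
        # The repository, not the caller, decides that the lock is needed
        # only for the short step of making the staged data visible.
        with self._physical_lock:
            self.lock_events.append("locked")
            self.committed.extend(self._pending)
        self.lock_events.append("unlocked")
        self._pending = None


repo = ToyRepository()
repo.start_write_group()
for item in ("text", "inventory", "revision", "signature"):
    repo.insert(item)
held_while_inserting = repo._physical_lock.locked()
repo.commit_write_group()
```

A different repository format could take the lock in start_write_group
and hold it the whole time, or never take one, without the caller
changing.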

With this change, the caller doesn't know when the repository should be
locked.  It is kind of an inversion or displacement of control, which is a
good thing because it allows more room for variation by the repo.  But if
the caller no longer controls when the repository is locked, why are they
calling lock_write?

One reason might be that they want to rely on keeping some things in
memory so that they can more cheaply run the hooks and update the working
tree after the write group finishes.  Maybe they should do lock_read and
do the write group inside that?
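
As a rough sketch of that idea, assuming a read lock is what keeps cached
data valid: the caller holds lock_read for the whole operation, does the
write group inside it, and then runs post-commit work from memory.
ToyRepository, add_revision, cached and commit below are hypothetical
names for illustration only, not bzrlib's API.

```python
class ToyRepository:
    """Hypothetical stand-in; the read lock only scopes an in-memory cache."""

    def __init__(self):
        self._read_locks = 0
        self._cache = {}
        self._pending = None
        self.revisions = []

    def lock_read(self):
        self._read_locks += 1

    def unlock(self):
        if self._read_locks == 0:
            raise RuntimeError("not locked")
        self._read_locks -= 1
        if self._read_locks == 0:
            # Cached data is only trusted while some lock is held.
            self._cache.clear()

    def start_write_group(self):
        if self._read_locks == 0:
            raise RuntimeError("must hold at least a read lock")
        self._pending = []

    def add_revision(self, rev_id):
        if self._pending is None:
            raise RuntimeError("not in a write group")
        self._pending.append(rev_id)

    def commit_write_group(self):
        if self._pending is None:
            raise RuntimeError("not in a write group")
        self.revisions.extend(self._pending)
        for rev_id in self._pending:
            # Keep the new data in memory for cheap post-commit use.
            self._cache[rev_id] = "data for " + rev_id
        self._pending = None

    def cached(self, rev_id):
        return self._cache.get(rev_id)


def commit(repo, rev_id):
    """Do a write group inside a read lock, then use the cached result."""
    repo.lock_read()
    try:
        repo.start_write_group()
        repo.add_revision(rev_id)
        repo.commit_write_group()
        # Stand-in for hooks or a working tree update: served from memory.
        return repo.cached(rev_id)
    finally:
        repo.unlock()


repo = ToyRepository()
post_commit_data = commit(repo, "rev-1")
```

Here the caller's lock only expresses "keep what I have in memory valid";
any write locking stays an internal matter for the write group.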

-- 
Martin