branch locking mk2.
Robey Pointer
robey at lag.net
Mon Feb 20 07:59:08 GMT 2006
On 19 Feb 2006, at 7:08, John A Meinel wrote:
> Robey Pointer wrote:
>>
>> On 15 Feb 2006, at 20:09, John A Meinel wrote:
>>
>>> Robey Pointer wrote:
>>>>
>>>> The name "fancy_rename" implies (to me) something like rename,
>>>> but with
>>>> added magic. It sounds like it's really just an attempt to
>>>> implement
>>>> "atomic_rename" of files for Transports that don't have it.
>>>> Having it
>>>> in osutils further implies that it's related to the local OS.
>>>>
>>>> What if we moved it into Transport.__init__.py and called it
>>>> "atomic_rename", to emphasize that it's really a base
>>>> implementation for
>>>> Transports that don't have an atomic rename? Then
>>>> Transport.rename (or
>>>> .replace) could promise to always be atomic.
>>
>> By the way, knits aside, did anyone have any comments on this?
>>
>
> We also use this elsewhere, like when merging changes into the working
> tree. Growing up on POSIX, people expect rename to be atomic. (And
> that
> you can delete files even if someone has them open, etc).
Right. Sounds like you're for it, then? (Just want to make sure,
since your answer is a little ambiguous.)
Part of the contract for Transport would now be that renames are "as
atomic as possible". And obviously you would never do a rename
except through a Transport.
>
>>
>>>> Another issue is that if file-renaming isn't atomic on some
>>>> platforms,
>>>> and we're relying on atomic behavior, this seems... bad. I
>>>> don't have
>>>> an answer. But it troubles me. Over SFTP, you can't even
>>>> guarantee
>>>> that the race is short -- we may get disconnected from the
>>>> server in the
>>>> middle of the atomic_rename, and now there's no weave file. Or
>>>> the user
>>>> may just have hit ^C. Scary. Will the append-only knit stuff
>>>> make this
>>>> a non-issue?
>>>
>>> Append only weaves would make it much less of an issue, since we
>>> would
>>> append to the files that exist, rather than replacing them with
>>> new ones.
>>>
>>> There will probably be some control files that are still atomically
>>> replaced, like revision-history (as long as that exists).
>>> Also, I don't know how we are trying to keep knit index files
>>> intact.
>>> I'm guessing we just expect that adding 1 line will be an atomic
>>> action,
>>> as long as we add less than X number of bytes.
>>> I can think of some things we could do (like adding a checksum +
>>> line-length at the beginning of each line, along with a line
>>> delimiter
>>> before the line (a simple newline could do)).
>>> Then we can throw out index lines that don't match their
>>> checksum. And
>>> the delimiter makes sure that we know where the entries start &
>>> stop.
>>
>> Seems like we rely on the atomic_replace, so trying to think of
>> some way
>> to make it work correctly...
>>
>> With all transports using the same locking, we'll at least know that
>> only one person can be updating metadata at a time. So maybe
>> atomic_rename could always use the same temporary, like
>> "<filename>.old". The worst interruption that could happen then is:
>>
>> <updater #1>
>> open('important.weave.temp-foobar-90210', 'w').write(stuff)
>> rename('important.weave', 'important.weave.old')
>> ---ACK! disconnected
>>
>> <reader>
>> open('important.weave', 'r') => error
>> open('important.weave.old', 'r') => ok, just go with that
>>
>> <updater #2>
>> open('important.weave', 'r') => error
>> open('important.weave.old', 'r') => ok, so fix:
>> rename('important.weave.old', 'important.weave')
>>
>>
>> This would require a bit more work when reading files that may be
>> atomically replaced. We'd have to check for failed updates and
>> move the
>> old copy back.
>
> Actually, this isn't all that bad. Just instead of renaming it out of
> the way to a temp location, you rename it to a specific location.
>
> The problem is what happens when you still have the old location, but
> you also have the final location? And you want to overwrite the final
> location. Do you just always issue a 'remove(foo.old)' thereby doing:
>
> remove(foo.old)
> put(foo.tmp.XXXXX)
> rename(foo, foo.old)
> rename(foo.tmp.XXXXX, foo)
> remove(foo.old)
>
>
> Actually, I just realized something. As long as we only put() when we
> hold the write lock, with the new locking mechanism, stale locks don't
> disappear (like OS level locks).
> So the only time you need to really do this is when breaking the lock.
> It would mean that a broken branch could not be read by a read-only
> operation (without switching to probing for foo, and foo.old). But we
> could repair the tree.
>
> Determining what is missing and what needs to be fixed might be a
> little
> tricky, we could write a transaction log in the new locking directory,
> but that has all the performance penalties of lots of round trips, and
> to be safe, you must effectively fsync() before you make a change.
> Though if you knew you were going to be making 50 changes, you could
> write that all out at once. (No need to write each one individually).
>
> And probably we don't need to remove(foo.old) until we get a
> FileExists
> exception.
>
> So, I'm +1 on the idea of moving to foo.old to replace behind it. I am
> surprised that you can't rename a file over another file. Since
> that is
> very posixy. I would have expected them to support it, and do tricky
> renaming on platforms that don't support it.
Me too. It's even worse in later SFTP drafts, where they try to let
the client request one of 3 different rename behaviors, any of which
the server can refuse to implement. Ugh.
And I agree: cleaning up (or restoring from) *.old files would only
have to occur after breaking a lock. Maybe breaking a lock could
perform a lightweight version of "check", to make sure all the files
are there, and clean up any *.old files it finds.
robey
More information about the bazaar
mailing list