[MERGE] Transport.non_atomic_put
John Arbash Meinel
john at arbash-meinel.com
Fri Aug 18 23:43:19 BST 2006
Martin Pool wrote:
> On 18 Aug 2006, John Arbash Meinel <john at arbash-meinel.com> wrote:
>> Attached is a patch that creates a Transport.non_atomic_put() command.
>> Along with a default implementation, and custom implementations for
>> Local and SFTP transports. (Ultimately we should add one for FTP as well).
>>
>> This is in response to my earlier discovery about why 'push' is slow,
>> and I'm trying to get the updates written in small chunks.
>
> Good idea; +1 with comments.
>
>> === modified file bzrlib/tests/test_transport_implementations.py // last-change
>> ... d:john at arbash-meinel.com-20060818193050-9da47e496bc0166f
>> --- bzrlib/tests/test_transport_implementations.py
>> +++ bzrlib/tests/test_transport_implementations.py
>> @@ -144,6 +144,29 @@
>> self.assertRaises(NoSuchFile,
>> t.put, 'path/doesnt/exist/c', 'contents')
>>
>
> I think tests are easier for people to maintain and understand if each
> case just tests one thing. In particular if something new is failing
> it's much easier if the test name tells you what it is. So even though
> this test is quite clear as it is, perhaps it can be pushed a bit
> further.
>
>> + def test_non_atomic_put(self):
>> + t = self.get_transport()
>> +
>> + if t.is_readonly():
>> + self.assertRaises(TransportNotPossible,
>> + t.non_atomic_put, 'a', StringIO('some text for a\n'))
>> + return
>
> So perhaps this should be split out into a separate
> test_nonatomic_put_on_readonly_fails, since it is a separate
> requirement. Maybe that's taking it too far.
Well, these tests are run across all permutations. We need to check
is_readonly() and return. I can have it be a separate test, which just
returns for non-readonly transports.
This was how other functions were tested. But I can do it either way.
...
> Well, if you think it usually will not exist, it might be better to
> always try the mkdir and ignore EEXISTS.
No, this is because if I delay creating knit files, I have to make the
hash prefix at a different time. (On the first add to the knit data most
likely).
And it has the same probability of existing as it does now. Low for a
little while, but then slowly more and more common.
See my follow up patch about how I wrote the changes which allow
non_atomic_put() to create the parent directory.
And there, we need to be extra careful about directory permissions,
since that would be yet another parameter, etc.
>
> Will the caller usually have a file-like object already? Perhaps we
> should have non_atomic_put_bytes, so as to avoid creating and unwrapping
> a StringIO object. Perhaps that should be deferred and done to all
> methods at once.
Not sure. So far, the callers have had strings, and lines of strings. So
we would still need to collapse the lines of strings.
But yes, all callers so far have something in memory, not a real
file-like object.
...
> What do you mean by 'sticky bit' here? To me it is the 01000 bit, and
> there's no such thing as the 'group sticky bit', and anyhow the umask
> never affects them. I don't see why we'd be wanting to set either the
> sticky bit or the setgid bit on plain files...
Right. I was commenting the wrong place. The real problem with the
sticky bit is on directories, not on files. That is the 02000 bit.
Nonetheless my attempt at enabling shared directories without any custom
hacking of sftp servers doesn't work. Because whenever we create a new
dir, we either lose the group write bit, because of the umask, or we
lose the group sticky bit, because of the chmod to the directory.
So we *might* want to remove a lot of the permission code, and just fall
back to expecting people setting umask correctly. I don't really think
we want to remove permissions, because it still helps for people with
local shared directories. But it doesn't help as much as I would like.
...
> Telling users to set the umask properly might be OK locally but it may
> not be easy for an sftp server - they'll need to work out what shell
> script, if any, gets run before the sftp server runs. I think in many
> cases they would need the sftp server's sysadmin to change it globally.
Right. But we are screwed either way.
https://launchpad.net/products/bzr/+bug/50568
>
> Should the close() be in a finally block?
>
I'll look close at the close(). But this is in SFTP, if we have an
exception, we may have already lost the connection. But the old _put()
code uses an exception catcher to ensure the file is closed, so I can
make sure to close in a finally here.
John
=:->
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 254 bytes
Desc: OpenPGP digital signature
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060818/5755090d/attachment.pgp
More information about the bazaar
mailing list