[MERGE] Faster LocalTransport.put() and append() methods.

John Arbash Meinel john at arbash-meinel.com
Sat Jun 10 14:10:31 BST 2006


Robert Collins wrote:
...

>> We should at least be 'chmod' ing if we create the file, see my earlier
>> discussion about shared branches. And we need 'chmod' because of umask.
>> Either that, or we always run with umask '0000', which I think is worse.
> 
> I dont see why we need to chmod *except* when making a branch new file
> on disk.
> 
> We currently chmod in the following cases when a mode is specified...
> AtomicFile
> LocalTransport.append() with a new file
> LocalTransport.append() with an existing file
> 
> Thats for .bzr files...

I agree. We should only need to do so when creating a new file, but how
do we detect that?

> 
> I think this is wrong, and we should only chmod when a mode is specified
> AND:
> the umask will result in a different mode on the output file if we
> supply it ourselves
> AND
> we are making a new file not altering an existing one.
> 
> What do you think?
> 

I would be happy with that. Problem #1 is that I don't know of any way
to determine the current umask without setting it. (os.umask() will
return the old umask, but only after you set a new one).
I suppose at some startup time the LocalTransport could detect the
umask, and then reset it and save the value from then on. Detecting if
it needs to do a chmod, or whether it can just set the mode bits at open
time.

Back when I added chmod, it didn't effect performance very much (which
is why I also had flags which would allow you to disable setting chmod).
I had done benchmarks, and didn't see an effective difference. Now that
we may have cleaned up some other stuff, we may want to re-evaluate it.

We should also keep in mind how we would want to tune
SFTPTransport.put() and .append()
We already tuned the .put() call, so that we enable pipelining for the
file contents, and then call chmod() at the very end. Which has the
effect of blocking and flushing the bytes, and then chmodding. So, in
theory, it hides the latency of the extra round trip.

I'm not sure if we can get the umask for sftp operations. The
'mode=None' parameter in _sftp_open is actually for WRITE/CREATE/etc not
the permissions bits. I think the SFTPAttributes() parameter might allow
us to set the file open mode, which should let us do a similar trick.
If you look at the StubSFTPServer it uses:

fd = os.open(path, flags, attr.st_mode)

So we should be able to do the equivalent of an 'os.open()' call from
the client. We might just default to assuming that the remote umask is
'0022', which I think is a reasonable assumption. So we would need to
chmod if they want the g+w bit set or the o+w bit, but otherwise we can
safely just open files with the right mode bits to start with.

> 
> For files in the source tree Aaron and I have a patch which removes the
> current chmod behaviour for a similar chmod-once approach (the aggregate
> patch saved 30% on checkouts). The logic there should IMO be different,
> because users expect umask to override things like checkouts. Or at
> least, I do.
> 
> Rob
> 

I agree that working directory files should be controlled by umask.

John
=:->

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 254 bytes
Desc: OpenPGP digital signature
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060610/eb21c2ed/attachment.pgp 


More information about the bazaar mailing list