plugin: sftp transport

Robey Pointer robey at lag.net
Wed Oct 19 19:51:00 BST 2005


I wish there were more hours in the day to do all the things I want  
to do. :)

I updated the SFTP plugin and put it in a bzr branch here:
     http://www.lag.net/~robey/bzr-plugins/sftp/


On 18 Oct 2005, at 14:35, John A Meinel wrote:

> [Robey wrote:]
>
>> It requires at least verison 1.4 of paramiko, though 1.5 is probably
>> better:
>>      http://www.lag.net/paramiko/
>>
>
> I know there has been some discussion (and I believe it was  
> decided) about
> using Twisted, which also provides sftp support. I'm guessing what  
> you've
> done can probably be adapted for it. But I know one reason I wasn't
> persuing it heavily, is that I was waiting for Twisted to be  
> integrated
> first.

A large part of it could probably be salvaged if we switch to  
Twisted, but I won't hold my breath for that.  Twisted is an event- 
based framework, so it would mean changing a lot of the flow of bzr  
to suit twisted.  (This is one reason I prefer synchronous code over  
event-based: Simpler is better.)


>> I implemented the write support, too, but did NOT test it yet
>
> Did you see the transport test suite, available in
>  bzrlib/selftest/testtransport.py?

I spent last night's and this morning's train rides playing with  
that.  Very cool.

I've added a stub (loopback) SFTP server to the plugin for unit  
tests, and hooked it up using the TestTransportMixIn.  Two tests  
fail, and they appear to be due to an impedance mismatch, so I'm not  
sure how to solve them yet: two tests are expecting a NoSuchFile  
exception to be raised, but get SFTPTransportError raised instead.

The problem is that paramiko is trying to emulate python's exception  
handling (converting SFTP errors into python exceptions), so it loses  
some information.  There's an SFTP error code for NO_SUCH_FILE which  
could map directly to the bzr NoSuchFile exception, but it's  
converted into a generic IOError on the way out.

The solution I'm leaning toward is to have paramiko define its own  
NoSuchFile exception, subclassed from IOError, and let the  
SFTPTransport plugin catch that and translate it to bzr's  
NoSuchFile.  That would require upgrading paramiko but that's  
probably not a serious issue.

In any case, the tests all seem to be "functionally" passing, aside  
from the exception mismatch.  I set the test to NOT be read-only,  
which improves my faith that this should work as a writable branch too.


>> Read and write locks aren't supported, just like with HTTP.  There's
>> probably something clever I can do there, assuming we only care about
>> read/write locking against other bzr-over-sftp instances, but I
>> haven't put much thought into it.
>
> Read locks would be nice, because it would let us cache things  
> locally,
> and guarantee that they aren't changing remotely. Write locks would be
> nice when we go to support "bzr push" to a remote branch. (and when
> Aaron's changes for remote bound-branches is implemented).
>
> I know the SFTP v6 spec supported all sorts of nice things, but I  
> believe
> everyone is using v3, and I don't know what that does and doesn't
> implement.

This is kind of a general problem, which I believe we should solve  
for all transports.  Basically there are tricks you can do even  
across a "dumb" filesystem to do locking -- arch did these and we may  
be able to steal from them.  As Jan Hudec points out in a different  
thread, it would be best if we could even support locking via one  
transport and seeing the lock via another.

As far as the SFTP implemented in the wild, there's no file locking,  
but there is an equivalent to "O_CREAT|O_EXCL" so all hope is not lost.

> It would be nice to have a directory with an __init__.py that also  
> has the
> test_suite(), so that running "bzr selftest" could also test the sftp
> plugin.

Yeah, I did that now.  I guess it was too optimistic to think I could  
keep it down to one file. :)

>> Oh yeah, this is mostly just ported from my last version, except
>> using the new Transport API (which has been cleaned up a lot --
>> thanks).
>
> I'm not really sure how the interface was cleaned up, perhaps I  
> moved the
> LocalTransport calls into a separate class. Or maybe I just missed  
> someone
> else's work.

I just meant compared to my last patch, which was before your  
Transport work got in.  Using the old "store" interface to implement  
new transports had a lot of redundancy.  This way is much better.

> I've looked over the code (briefly), and generally it looks good.  
> I'm a
> little curious why you have the global SYSTEM_HOST_KEYS loaded into
> save_host_keys(), but you don't use it.

Good point, too much cut-n-paste. :)

> Also, I would think that in _sftp_connect() we would preferentially  
> use
> "BZR_HOST_KEYS" over using SYSTEM_HOST_KEYS, since we are BZR, but  
> this
> probably isn't a big deal.

*shrug* Makes sense to me, so I swapped them.

> I also noticed that you never disconnect from the remote host. Now, I
> don't know if the disconnect automatically happens when the last  
> reference
> to self._sftp goes away, which is perfectly fine.

Yeah, it's using python __del__ magick so that SFTP file objects can  
be casually discarded just like python's built-in file objects.  The  
same applies up the chain to the (paramiko) Transport.

> We might also consider having a weakref dictionary of remote hosts  
> that we
> are currently connected to, and their connection object. I know that I
> believe everything will be called through "clone()" but I suppose  
> it would
> be possible for a second connection to be attempted. A weakref  
> dictionary
> lets you use the connection if it exists, but it also doesn't hang  
> on to
> it (letting it disconnect) when all real references go away.

It's a good idea.  I'll punt on that for now though, until I'm more  
convinced that the core of it is solid.

Thanks for the comments!  Sorry this email got so long.

robey





More information about the bazaar mailing list