sftp tests are slow

Robey Pointer robey at lag.net
Wed Jan 11 19:10:28 GMT 2006


On 10 Jan 2006, at 12:23, John Arbash Meinel wrote:

> Robey Pointer wrote:
>> I changed the subject because a bunch of different threads had the same
>> meaningless subject line. :)
>>
>>
>> On 9 Jan 2006, at 18:05, John A Meinel wrote:
>>
>>> Robert Collins wrote:
>>>
>>>> Yes, this is slow. We chatted on IRC about this - it's about 10 times
>>>> slower than starting openssh... I find it hard to believe that this is
>>>> intrinsic to python - I have to regard it as a bug in the stub server.
>>
>>
>> 10 times is hard to believe -- what did you use to test that?
>
> Tested the time to run:
>  sftp = bzrlib.transport.get_transport('sftp://localhost/')
>  sftp.get('a').read()
> both with _ssh_vendor = 'none' and _ssh_vendor = 'openssh'
>
> Versus "time ssh localhost cat a"
> $ time ssh localhost cat a
> real    0m0.142s

Disclaimer with my results: your machine is much faster than my [old]  
iBook. :)

I get about 1.1s for 'time ssh localhost cat a' and from 1.1 to 1.2s  
for an equivalent paramiko call to exec_command.  Actually I'm a  
little surprised that it did well there.

Using this:


> python
> >>> import bzrlib.transport
> >>> import time
> >>> def do_time():
> ...   tstart = time.time()
> ...   sftp = bzrlib.transport.get_transport('sftp://localhost/')
> ...   sftp.get('a').read()
> ...   tdone = time.time()
> ...   return tdone - tstart
> ...

... came out significantly worse, 50% slower for vendor "none" than  
for vendor "openssh".  (average: paramiko = 1.36s; openssh = 0.9s)   
I'm at a loss to explain the difference, really.
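
A minimal sketch of how averages like these could be collected over several runs. The names average_time and fetch_stand_in are hypothetical and not from bzrlib; the stand-in just sleeps, and would be swapped for the real get_transport(...).get('a').read() call when measuring.

```python
import time


def average_time(func, repeats=5):
    """Call func() `repeats` times and return the mean wall-clock seconds."""
    samples = []
    for _ in range(repeats):
        tstart = time.perf_counter()
        func()
        samples.append(time.perf_counter() - tstart)
    return sum(samples) / len(samples)


def fetch_stand_in():
    # Hypothetical placeholder for the real work being timed, e.g. the
    # transport fetch quoted above. It only sleeps briefly.
    time.sleep(0.01)


if __name__ == "__main__":
    print("average: %.3fs" % average_time(fetch_stand_in))
```

Averaging several runs smooths out one-off costs such as a cold disk cache, which matters when the two numbers being compared differ by only a few hundred milliseconds.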

I did an lsprof on the paramiko version of the test, which had a few  
interesting things in it (inflate_long takes 1ms each time?!) but no  
obvious smoking gun.  I'll probably poke at that more later.
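
For anyone who wants to repeat this kind of profiling run, here is a rough sketch using cProfile, the standard-library profiler that descends from lsprof. This is not the lsprof invocation used above; profile_call and busy_work are hypothetical names, and busy_work is only a stand-in for the code being measured.

```python
import cProfile
import io
import pstats


def profile_call(func, top=10):
    """Run func() under cProfile and return a report of the costliest calls."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time so the expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()


def busy_work():
    # Hypothetical stand-in for the code under test.
    return sum(i * i for i in range(50000))


if __name__ == "__main__":
    print(profile_call(busy_work))
```

Sorting by cumulative time tends to surface a hot call chain; sorting by "tottime" instead would highlight individual functions that are slow per call.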


> And that compared with the fact that a simple SFTPServer test, with very
> little going on, takes >1s to complete.

Maybe we have different meanings for "very little going on" ;) but  
there's a heck of a lot going on during those tests.  That's why  
using openssh doesn't make so much difference there.  We're not just  
doing one side, but both sides of the handshake, creating several  
folders and control files, then closing up and tearing down again.   
So there's some CPU-intensive stuff and then some back-and-forth  
latency.


>> Last night I ran the (unmodified) "selftest test_sftp" tests on my
>> Linux box under lsprof and posted the results here:
>>
>>     http://www.lag.net/~robey/test_sftp.html
>
> I seem to get timeouts trying to connect to this page.

Maybe they were having an outage at the time -- the site was up at  
least part of the day yesterday.  Try again? :)


> It might just be more sticker shock to watch 'bzr selftest --verbose
> test_sftp' and see all of those tests taking 1000+ms, and most of the
> other tests taking < 10ms.

They take a long time.  It might be worth punting them into an "--all"  
option so they aren't normally run, but do get run when you want a  
thorough test and don't mind going out for coffee while it runs.   
Hopefully the test suite will keep growing, too.
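
One way the opt-in idea could look, sketched with the standard unittest module rather than bzr's own selftest runner. The RUN_SLOW_TESTS environment variable and the class names are assumptions made for illustration; this is not an existing bzr option.

```python
import os
import unittest

# Hypothetical switch: set RUN_SLOW_TESTS=1 to include the slow tests.
RUN_SLOW = os.environ.get("RUN_SLOW_TESTS") == "1"


class FastTests(unittest.TestCase):
    def test_quick(self):
        self.assertEqual(1 + 1, 2)


@unittest.skipUnless(RUN_SLOW, "slow test; set RUN_SLOW_TESTS=1 to run")
class SlowTests(unittest.TestCase):
    def test_expensive(self):
        # Stand-in for a test that does a full client/server round trip.
        self.assertTrue(True)


def run_suite():
    """Run both test classes and return the unittest result object."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(FastTests))
    suite.addTests(loader.loadTestsFromTestCase(SlowTests))
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```

By default the slow class is reported as skipped rather than silently dropped, so the test count still shows that the tests exist.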

Basically if it comes down to "should we optimize the unit tests?"  
I'd rather focus on optimizing the core code.

robey




