prefetch still broken with readv and paramiko 1.6.1

Wed Jul 26 20:14:33 BST 2006

On 26 Jul 2006, at 5:14, John Arbash Meinel wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Robey Pointer wrote:
>>
>> On 25 Jul 2006, at 9:28, John Arbash Meinel wrote:
>>
>>> Well, I've found a few more bugs in the readv/prefetching logic.
>>
>> Would you try the current bzr trunk?  Located here:
>>
>>     http://www.lag.net/paramiko/bzr/paramiko
>>
>> I took a 90 degree turn, decided to stop explaining how readv is meant
>> to be used, and tried to adapt it to how you're using it.  So I'd
>> appreciate feedback on whether that makes it work right for you.
>>
>
> I do believe I understand how readv is supposed to be used, and am
> using it correctly.

I didn't mean that to come out sounding snotty.  What I meant was:  
Instead of trying to tell people how I think they should do things, I  
decided to let people do what they think is best and try to make that  
work the way they think it ought to.

>
>> I still think that in general you won't want to use prefetch when you
>> know you're just going to readv() a few sections of the file.  But
>> that should at least work now and not be as huge a penalty as it was
>> before.
>>
>
> I'm not using readv() with prefetch(). I do understand they overlap.
> What I was *testing* is that if you are downloading *most* of the file
> using seek+read, it is actually faster to do a prefetch(). But if you
> aren't downloading most of the file, it is much better to just read
> the sections you want.
>
> And the best is to use readv() which does an async request for just
> the sections you want.

Sorry, I'd thought one of your bug reports was related to using  
prefetch() immediately before readv().
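
For what it's worth, the trade-off you describe can be put into a rough
back-of-the-envelope model. This is only a sketch under loud assumptions:
every seek+read is taken to cost one full blocking round trip, a
whole-file fetch is taken to be perfectly pipelined, and the function
names, the latency, and the bandwidth figures are all made up for
illustration. None of it comes from paramiko or from your measurements.

```python
def seek_read_time(sections, rtt, bandwidth):
    """Estimated seconds to fetch sections one blocking request at a time.

    sections  -- list of (offset, length) pairs
    rtt       -- assumed round-trip time in seconds
    bandwidth -- assumed throughput in bytes per second
    """
    total_bytes = sum(length for _offset, length in sections)
    # Assumption: each section pays one round trip before its data arrives.
    return len(sections) * rtt + total_bytes / bandwidth


def whole_file_time(file_size, rtt, bandwidth):
    """Estimated seconds to stream the whole file with pipelined requests."""
    # Assumption: the requests overlap, so only one round trip is exposed.
    return rtt + file_size / bandwidth


if __name__ == "__main__":
    rtt, bandwidth = 0.05, 1e6  # hypothetical link: 50 ms, 1 MB/s
    dense = [(i * 1000, 900) for i in range(100)]  # 90% of a 100 KB file
    print("seek+read :", seek_read_time(dense, rtt, bandwidth))
    print("whole file:", whole_file_time(100_000, rtt, bandwidth))
```

Under this model the per-request round trips dominate as soon as there
are many sections, which fits what you saw; with only a couple of small
sections out of a large file the whole-file fetch loses instead.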

> You've been doing fine for me. Maybe I haven't been explaining what
> I'm doing well. The round trip overhead of doing lots of 'seek + read'
> calls is sufficient that if you are getting some large fraction
> (say > 90%) of a file, it is faster to just request the whole file,
> and break it up on the local side.
>
> Now, my test setup has biased this as well, since I was using a
> loopback delay (using 'tc' and 'netem'). Which means that though a
> single ping takes 50ms, data can stream at a very high rate. Which is
> why I'm re-doing the test against a real server.
>
> I'll look into your changes in the trunk of paramiko. The only thing I
> want you to change about paramiko's readv() is for it to handle the
> 32K/64K bug. Because a single request can go above that.

That should be done in the trunk: it silently breaks the requests  
into smaller chunks if necessary, building them back up as they come in.
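
To illustrate the idea (this is a minimal sketch, not the code in the
trunk): oversized (offset, length) requests are cut into pieces no
larger than a cap, and the replies are glued back together per original
request. The helper names and the 32768-byte cap are assumptions picked
for the example.

```python
MAX_REQUEST_SIZE = 32768  # assumed per-request cap, for illustration only


def split_requests(chunks, max_size=MAX_REQUEST_SIZE):
    """Split (offset, length) requests into pieces of at most max_size bytes."""
    pieces = []
    for offset, length in chunks:
        while length > 0:
            size = min(length, max_size)
            pieces.append((offset, size))
            offset += size
            length -= size
    return pieces


def reassemble(chunks, piece_data, max_size=MAX_REQUEST_SIZE):
    """Join per-piece replies back into one bytes object per original chunk.

    piece_data must hold the replies in the same order that
    split_requests() produced the pieces.
    """
    replies = iter(piece_data)
    results = []
    for _offset, length in chunks:
        n_pieces = -(-length // max_size)  # ceiling division
        results.append(b"".join(next(replies) for _ in range(n_pieces)))
    return results


if __name__ == "__main__":
    # A 70000-byte request becomes two full pieces and one remainder.
    print(split_requests([(0, 70000)]))
```

The caller still sees one result per request it made, so the splitting
stays invisible from the outside.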

I also fixed it (knock on wood) so that you can do a prefetch() and  
then readv() without setting off fireworks.  It should even try to  
use prefetch buffers when a readv overlaps with chunks of the file  
that have been prefetched (or are in the process of being prefetched)  
but haven't been read by the app yet.
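
Very roughly, the overlap check amounts to something like the sketch
below. It is a simplification and not paramiko's actual internals: the
function name and the dict-of-buffers layout are hypothetical, and it
only handles a request that fits entirely inside one buffer that has
already arrived.

```python
def read_from_prefetch(buffers, offset, length):
    """Return the requested bytes if one prefetched buffer covers them.

    buffers -- dict mapping a buffer's starting file offset to its bytes
               (a hypothetical layout chosen for this example)
    Returns None when no single buffer fully covers the request, in which
    case the caller would fall back to asking the server.
    """
    for start, data in buffers.items():
        if start <= offset and offset + length <= start + len(data):
            rel = offset - start
            return data[rel:rel + length]
    return None


if __name__ == "__main__":
    prefetched = {0: b"abcdefgh", 100: b"XYZ"}
    print(read_from_prefetch(prefetched, 2, 3))  # covered by the first buffer
    print(read_from_prefetch(prefetched, 6, 5))  # runs past it, so None
```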

robey