attn: robey - urgentish paramiko bug

Robey Pointer robey at lag.net
Fri Mar 10 06:19:28 GMT 2006


On 8 Mar 2006, at 6:28, Robert Collins wrote:

>  time ../../versioned-file-performance/bzr branch
> sftp://people.ubuntu.com/home/robertc/public_html/baz2.0/ 
> integration.asknit ../integration-from-sftp
> bzr: ERROR:
> exceptions.AssertionError:
>   at /usr/lib/python2.4/site-packages/paramiko/sftp_file.py line 92
>   in _read_prefetch
> Killed by signal 1.
>
> Robey, this occurs with prefetch on, and I'm testing w/prefetch off  
> but
> it looks happier already.

Work is keeping me very busy this week, but Robert and I chatted  
about this on IRC, so I wanted to give an update to the list.  The  
key point is:


> This is with knits which use Transport.readv heavily:
> f = self.get(foo)
> f.seek(distance)
> f.read(length)

The "get()" in Transport currently turns on prefetch, which  
immediately starts fetching data from the file.  In some cases, the  
seeks will jump back to an earlier point in the file.  Current  
paramiko (1.5.3) assumes prefetched files will be read somewhat  
linearly -- at least, that seeks will always be forward.

Robert worked around this by re-ordering the seeks to always be forward.
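
For anyone following along, here is a rough sketch of what that kind of
re-ordering can look like.  It is not Robert's actual change: the name
readv_forward is made up for illustration, and io.BytesIO stands in for
the remote file.  It visits the requested offsets in ascending order and
hands the data back in the order the caller asked for.

```python
import io


def readv_forward(f, requests):
    """Read (offset, length) ranges from f, visiting offsets in ascending order.

    Results come back in the order the ranges were requested.
    Assumes the ranges do not overlap; an overlapping range would
    still force a small backward seek.
    """
    # Indices of the requests, sorted by the offset each one asks for.
    order = sorted(range(len(requests)), key=lambda i: requests[i][0])
    results = [None] * len(requests)
    for i in order:
        offset, length = requests[i]
        f.seek(offset)
        results[i] = f.read(length)
    return results


if __name__ == "__main__":
    f = io.BytesIO(b"0123456789abcdef")
    # Requested out of order; read as offsets 0, 4, 10.
    print(readv_forward(f, [(10, 3), (0, 2), (4, 4)]))
```

The caller never sees the sort, so code that asks for ranges in
arbitrary order keeps working unchanged.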

I have a version in paramiko's bzr head
<http://www.lag.net/paramiko/bzr/paramiko> which allows random seeking
after prefetch, though it will throw away a block of prefetched data
after it's read, so reading the same block multiple times will incur
new server round-trips.  We agreed this was a fair compromise.
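
To make the trade-off concrete, here is a toy model of the behaviour
described above.  It is not paramiko's code: the class name, the
block-aligned reads and the round_trips counter are all assumptions made
for illustration.  Prefetched blocks can be read in any order, but each
one is dropped once it has been served, so a repeat read has to go back
to the "server".

```python
class DiscardingPrefetchCache:
    """Toy model: prefetched blocks are served once, then thrown away."""

    def __init__(self, data, block_size):
        self._data = data              # stands in for the remote file
        self._block_size = block_size
        self.round_trips = 0           # extra fetches made after the prefetch
        # Pretend the whole file was prefetched, one block per start offset.
        self._blocks = {
            start: data[start:start + block_size]
            for start in range(0, len(data), block_size)
        }

    def read_block(self, start):
        """Return the block at a block-aligned offset, in any order."""
        block = self._blocks.pop(start, None)  # discard after first read
        if block is None:
            # Already consumed: simulate a new server round-trip.
            self.round_trips += 1
            block = self._data[start:start + self._block_size]
        return block


if __name__ == "__main__":
    cache = DiscardingPrefetchCache(b"abcdefgh", 4)
    cache.read_block(4)        # served from prefetched data
    cache.read_block(0)        # backward seek, still served from prefetch
    cache.read_block(4)        # second read of the same block
    print(cache.round_trips)
```

Only the second read of the same block costs a round-trip; seeking
backward to a block that has not been read yet is free.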

An unresolved issue is:  Should the SSH transport be turning on  
prefetch right away after opening a file (via Transport.get)?  I'm  
leaning toward "no", since pre-fetching a file immediately causes the  
entire file to be downloaded, even if we only want a tiny piece of  
it.  But I don't know the common use case of the new knit stuff yet.

robey




