attn: robey - urgentish paramiko bug
Robey Pointer
robey at lag.net
Fri Mar 10 06:19:28 GMT 2006
On 8 Mar 2006, at 6:28, Robert Collins wrote:
> time ../../versioned-file-performance/bzr branch sftp://people.ubuntu.com/home/robertc/public_html/baz2.0/integration.asknit ../integration-from-sftp
> bzr: ERROR:
> exceptions.AssertionError:
> at /usr/lib/python2.4/site-packages/paramiko/sftp_file.py line 92
> in _read_prefetch
> Killed by signal 1.
>
> Robey, this occurs with prefetch on, and I'm testing w/prefetch off but
> it looks happier already.
Work is keeping me very busy this week, but Robert and I chatted
about this on IRC, so I wanted to give an update to the list. The
key point is:
> This is with knits which use Transport.readv heavily:
> f = self.get(foo)
> f.seek(distance)
> f.read(length)
The "get()" in Transport currently turns on prefetch, which
immediately starts fetching data from the file. In some cases, the
seeks will jump back to an earlier point in the file. Current
paramiko (1.5.3) assumes prefetched files will be read somewhat
linearly -- at least, that seeks will always be forward.
Robert worked around this by re-ordering the seeks to always be forward.
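
A rough sketch of that kind of re-ordering, not the actual bzr code:
readv_forward and its (offset, length) request list are hypothetical
names, and io.BytesIO stands in for the remote file. It assumes the
requested ranges do not overlap; overlapping ranges would still need a
small backward seek.

```python
import io


def readv_forward(f, requests):
    """Read (offset, length) pairs from f, visiting them in offset order.

    Hypothetical helper: results come back in the caller's original
    order, but the seeks themselves only ever move forward as long as
    the requested ranges do not overlap.
    """
    # Indices of the requests, sorted by their starting offset.
    order = sorted(range(len(requests)), key=lambda i: requests[i][0])
    results = [None] * len(requests)
    for i in order:
        offset, length = requests[i]
        f.seek(offset)
        results[i] = f.read(length)
    return results


data = io.BytesIO(b"abcdefghijklmnopqrstuvwxyz")
chunks = readv_forward(data, [(20, 3), (0, 2), (10, 4)])
print(chunks)
```

The caller still gets its chunks back in the order it asked for them;
only the order of the underlying seeks changes.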
I have a version in paramiko's bzr head
<http://www.lag.net/paramiko/bzr/paramiko> which allows random seeking
after prefetch, though it will throw away a block of prefetched data
after it's read, so reading the same block multiple times will incur
new server round trips. We agreed this was a fair compromise.
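
To make the trade-off concrete, here is a toy model of a cache that
discards each prefetched block once it has been read. This is not
paramiko's implementation: DiscardingPrefetchCache, fetch_block, and
the fixed block size are all made up for illustration, and the
prefetch here is synchronous where the real one is pipelined.

```python
class DiscardingPrefetchCache:
    """Toy model: prefetched blocks are thrown away once they are read."""

    def __init__(self, fetch_block, block_size):
        # fetch_block(offset, size) -> bytes stands in for a server request.
        self._fetch_block = fetch_block
        self._block_size = block_size
        self._blocks = {}
        self.round_trips = 0  # on-demand fetches made after prefetch

    def prefetch(self, file_size):
        # Pull every block up front (simplified: real prefetch is async).
        for offset in range(0, file_size, self._block_size):
            self._blocks[offset] = self._fetch_block(offset, self._block_size)

    def read_block(self, offset):
        # pop() removes the block, so a second read of it misses the cache.
        block = self._blocks.pop(offset, None)
        if block is None:
            self.round_trips += 1
            block = self._fetch_block(offset, self._block_size)
        return block


content = b"0123456789abcdef"
cache = DiscardingPrefetchCache(lambda off, size: content[off:off + size], 4)
cache.prefetch(len(content))
first = cache.read_block(8)   # served from prefetched data
second = cache.read_block(0)  # backward seek, still served from prefetch
again = cache.read_block(8)   # already discarded, costs a round trip
print(first, second, again, cache.round_trips)
```

In this model a backward seek to an unread block is free, and only the
repeated read of the same block goes back to the server.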
An unresolved issue is: should the SSH transport turn on prefetch
right away after opening a file (via Transport.get)? I'm leaning
toward "no", since prefetching a file immediately causes the entire
file to be downloaded, even if we only want a tiny piece of it. But I
don't know the common use case of the new knit stuff yet.
robey