prefetch still broken with readv and paramiko 1.6.1
John Arbash Meinel
john at arbash-meinel.com
Wed Jul 26 03:51:22 BST 2006
John Arbash Meinel wrote:
...
> In general, these results indicate that using paramiko.readv()
> could save quite a bit of time, since it does async requests. (And can
> thus request only the exact bytes.) But we would need to solve the 64K
> boundary before we could enable it.
>
> John
> =:->
Just as a teaser, I have implemented code to actually do this
(available in my readv-combining branch).
Initial tests with a local delay show a lot of promise. The attached
images tell the story. But basically 'prefetch' did help when doing a
full branch, and I found that using 'readv' helped even more.
'seek+read' is my current generic readv() implementation which is
combining ranges.
'seek+read+prefetch' is just doing the same thing, only calling
file.prefetch() first.
'readv' is just using paramiko's sftp.file.readv() on the raw offsets
that were requested. This is unsafe to do (per my earlier email), but I
wanted it for reference.
'readv combined' goes one level further: it does the same collapsing as
seek+read, then splits the resulting hunks into 64K requests.
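
To make the difference between these variants concrete, here is a rough
sketch of the collapsing and splitting steps. It is not the code from the
readv-combining branch: the function names are made up for illustration,
the rule that ranges are merged when they touch or overlap is an
assumption, and the limit is simply the 64K mentioned above. It works on
any file-like object with seek() and read().

```python
import io

# Assumed per-request limit, taken from the "64K" discussed in the text.
MAX_REQUEST = 64 * 1024


def coalesce(offsets):
    """Merge (offset, length) pairs that touch or overlap (hypothetical helper)."""
    combined = []
    for start, length in sorted(offsets):
        end = start + length
        if combined and start <= combined[-1][1]:
            # Touches or overlaps the previous range: extend it.
            combined[-1][1] = max(combined[-1][1], end)
        else:
            combined.append([start, end])
    return [(start, end - start) for start, end in combined]


def split_requests(ranges, limit=MAX_REQUEST):
    """Cut each combined range into pieces of at most `limit` bytes."""
    pieces = []
    for start, length in ranges:
        while length > 0:
            size = min(length, limit)
            pieces.append((start, size))
            start += size
            length -= size
    return pieces


def seek_read_readv(fileobj, offsets):
    """Generic 'seek+read' style readv: one seek and one read per combined range."""
    buffers = []
    for start, length in coalesce(offsets):
        fileobj.seek(start)
        buffers.append((start, fileobj.read(length)))
    # Hand the data back in the order it was requested.
    for offset, length in offsets:
        for start, data in buffers:
            if start <= offset and offset + length <= start + len(data):
                yield offset, data[offset - start:offset - start + length]
                break
        else:
            raise ValueError("short read for range (%d, %d)" % (offset, length))
```

In this sketch, 'seek+read' corresponds to seek_read_readv(), and 'readv
combined' would correspond to passing split_requests(coalesce(offsets)) as
the list of (offset, length) pairs handed to paramiko's readv(), then
reassembling the pieces.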
For those who don't like pretty pictures, the 'bzr branch' time drops
from 680s in bzr.dev down to 478s with seek+read, 422s with
seek+read+prefetch, and 366s with readv combined.
'bzr pull' time (for pulling from revno 1800=>1865, or 390 revisions) is
128s with bzr.dev, 87s with seek+read, 127s with seek+read+prefetch
(yes, much worse), 84s with plain readv, and 78s with readv-combined.
So the combining doesn't help a short pull a whole lot, but it helped
the full 'bzr branch' time enough that I felt it was worth the complexity.
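
For anyone who wants the relative savings rather than raw seconds, the
arithmetic is short enough to script. The timings below are just the
numbers quoted in this mail; the helper name percent_saved is made up for
this example.

```python
def percent_saved(baseline, measured):
    """Percentage of wall-clock time saved relative to the bzr.dev baseline."""
    return round(100.0 * (baseline - measured) / baseline, 1)


# Timings in seconds, as reported above.
BRANCH_BASELINE = 680
branch = {
    "seek+read": 478,
    "seek+read+prefetch": 422,
    "readv combined": 366,
}

PULL_BASELINE = 128
pull = {
    "seek+read": 87,
    "seek+read+prefetch": 127,
    "readv": 84,
    "readv combined": 78,
}

branch_saved = {name: percent_saved(BRANCH_BASELINE, secs)
                for name, secs in branch.items()}
pull_saved = {name: percent_saved(PULL_BASELINE, secs)
              for name, secs in pull.items()}

for label, table in (("bzr branch", branch_saved), ("bzr pull", pull_saved)):
    for name, saved in table.items():
        print("%-10s %-20s %5.1f%% less time than bzr.dev" % (label, name, saved))
```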
I'm currently running a real test against a remote host (ping of 34ms) to
make sure I wasn't just tuning things for high-latency but
massive-bandwidth connections (which is what you get when you enable
latency on the loopback).
I've prettied up the code a lot, and I'm almost ready to submit it for
review. The only thing left is to remove all of the environment
variables that I used to benchmark things; then it will be ready to
submit to the mailing list. (But I can't do that until tomorrow, because
I'm waiting for my performance tests to finish.)
John
=:->
-------------- next part --------------
A non-text attachment was scrubbed...
Name: branch-sftp.png
Type: image/png
Size: 25888 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060725/86a59d1e/attachment.png
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pull-390-sftp.png
Type: image/png
Size: 22319 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060725/86a59d1e/attachment-0001.png