[MERGE] [BUG #165061] Force http.readv() to a maximum number of range requests.
John Arbash Meinel
john at arbash-meinel.com
Thu Nov 29 16:07:35 GMT 2007
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Vincent Ladeuil wrote:
>>>>>> "john" == John Arbash Meinel <john at arbash-meinel.com> writes:
>
> john> Vincent Ladeuil wrote:
> >>>>>>> "robert" == Robert Collins <robertc at robertcollins.net> writes:
> >>
> robert> Robert Collins has voted resubmit.
> robert> Status is now: Resubmit
> robert> Comment:
> robert> I think it's better to make many requests than to ask for more
> robert> data than the fudge factor requires.
> >>
> >> Agreed.
> >>
> robert> That is:
> robert> http._readv
> robert> for group in xrange(0, len(offsets), max_ranges):
> robert>     for result in self._do_readv(
> robert>             offsets[group:group + max_ranges]):
> robert>         yield result
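[Editor's note: a runnable version of the batching idea Robert sketches above. The function and parameter names (`readv_in_batches`, `do_readv`) are illustrative, not bzrlib's actual API; it only assumes `offsets` is a list of `(start, length)` tuples and `do_readv` performs one multi-range request.]

```python
def readv_in_batches(offsets, max_ranges, do_readv):
    """Issue readv requests in batches of at most max_ranges ranges.

    offsets: list of (start, length) tuples.
    do_readv: callable taking one batch of offsets and yielding
    (start, data) results for that batch (hypothetical helper).
    """
    for group in range(0, len(offsets), max_ranges):
        batch = offsets[group:group + max_ranges]
        for result in do_readv(batch):
            yield result
```

This keeps every individual HTTP request at or below `max_ranges` ranges rather than padding one request with data the fudge factor does not require.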
> >>
> >> Except two details invalidate this simple solution:
> >>
> >> - out of order offsets (but jam pointed me to sftp to handle
> >> that, with a risk of buffering too much but the consensus is
> >> that packs should not suffer from that),
> >>
> >> - transient errors handling (this bug showed us that the whole
> >> file can and will be downloaded, which should be avoided when
> >> possible).
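[Editor's note: one way to handle the transient errors Vincent mentions, retrying only the ranges not yet delivered instead of falling back to downloading the whole file. This is a simplified sketch with hypothetical names (`readv_with_retry`, `do_readv`), not bzrlib's actual code.]

```python
def readv_with_retry(offsets, do_readv, max_retries=3):
    """Yield (start, data) for each range, retrying transient failures.

    On an IOError, only the ranges that have not yet been delivered
    are re-requested, so a transient error does not force a re-fetch
    of the entire file.  do_readv is a hypothetical helper that takes
    a list of (start, length) tuples and yields (start, data).
    """
    pending = list(offsets)
    for attempt in range(max_retries):
        delivered = set()
        try:
            for start, data in do_readv(pending):
                delivered.add(start)
                yield (start, data)
        except IOError:
            # Drop the ranges we already handed out; retry the rest.
            pending = [o for o in pending if o[0] not in delivered]
            continue
        return
    raise IOError('ranges still missing after %d retries' % max_retries)
```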
> >>
> >> So yes, your proposal is the right direction and I'll try to
> >> implement it in one or two steps:
> >>
> >> - reuse the sftp_readv with slight adaptations*,
> >>
> >> - rewrite the http response handling to deliver data as soon as
> >> they arrive. This is needed and will massively reduce the
> >> memory consumption.
> >>
> >> I don't have a lot of free time these days so I'll look at
> >> implementing the first step quickly.
> >>
> >> Vincent
> >>
> >> *: needs adaptation since after aggregation the big chunks are
> >> split again to accommodate sftp limits.
>
> john> I think you could get rid of that in sftp even. It was just a workaround for an
> john> earlier version of paramiko which assumed that your requests would always fit.
> john> So 1.5.2 would need it, but 1.6+ doesn't (IIRC).
>
> Thanks for confirming that (I was a bit surprised when looking at
> both sftp.py* and paramiko), I just committed an adapted version
> that takes care of out of order offsets and transient errors.
>
> No tests yet and no http response rewriting, so memory
> consumption may be worse in some cases which were failing before
> the patch.
>
> On the other hand, it also fixes bug #172071, so I'll continue on
> that one.
I think you need to check your bugs. I'm pretty sure this is a typo:
https://bugs.launchpad.net/bzr/+bug/172071
Especially since the bug is written in Spanish.
You probably meant:
https://bugs.edge.launchpad.net/bzr/+bug/172701
>
> Vincent
>
> *: You may want to clean up the code and the comments then ?
>
*I* want to do no such thing. :) But I'm happy to approve a patch.
This may not be the best location to have the comment but:
# The sftp spec says that implementations SHOULD allow reads
# to be at least 32K. paramiko.readv() does an async request
# for the chunks. So we need to keep it within a single request
# size for paramiko <= 1.6.1. paramiko 1.6.2 will probably chop
# up the request itself, rather than us having to worry about it
_max_request_size = 32768
Which is right at the beginning of SFTPTransport.
Certainly we could add that as step (8) in the _sftp_readv() comment.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFHTuPGJdeBCYSNAAMRAkVoAKDIQ2ySBjA5u7094BCsWwZPHuGTyQCglNsj
afrf49C86l6mIO+5PW1IMyg=
=ra6J
-----END PGP SIGNATURE-----
More information about the bazaar mailing list