[BUNDLE] let PyCurl transport read multiple ranges.
Michael Ellerman
michael at ellerman.id.au
Thu Jun 8 17:54:25 BST 2006
On 6/8/06, Johan Rydberg <jrydberg at gnu.org> wrote:
>
> Hi,
>
> I spent a few minutes on Michael Ellerman's proof-of-concept patch to
> let readv() read multiple ranges at once.
>
> The differences between this patch and Ellerman's are minimal:
>
> * I've wrapped pycurl.Curl objects in a Curl object.
> * Transport.clone() reuses the Curl-object from the base transport.
> * Only one Curl-object is used per transport.
> * Ranges are combined (e.g., 10-20,20-30 is combined to 10-30).
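
That last item, combining overlapping or adjacent ranges, is essentially
a sort-and-merge. As an illustrative sketch (mine, not Johan's actual
code), with inclusive (start, end) pairs:

    def coalesce_ranges(ranges):
        """Merge overlapping or adjacent inclusive (start, end) ranges."""
        merged = []
        for start, end in sorted(ranges):
            if merged and start <= merged[-1][1] + 1:
                # Overlaps or abuts the previous range, so extend it.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        return merged

    # coalesce_ranges([(10, 20), (20, 30), (40, 50)]) -> [(10, 30), (40, 50)]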
Damn, we just had a mid-air collision :)
I've actually been quietly reworking this code off in a corner. It's
not quite ready to go (there are no tests yet :}), but it's coming along.
YMMV.
Latest code is at: http://michael.ellerman.id.au/bzr/branches/http
I'd still like to do something with caching the Curl objects, so when
I get my stuff finished I'll try to merge your work for that on top.
The combining of ranges is also useful for various reasons, not least
keeping the Range header short.
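
For concreteness, the shape I have in mind for a wrapped, shared Curl
object doing a multi-range readv() is roughly the following. It's only
a sketch: the class and method names are illustrative, and only the
pycurl calls themselves are real.

    from StringIO import StringIO
    import pycurl

    class CurlWrapper(object):
        """Wrap one pycurl.Curl so a transport and its clones can share it."""

        def __init__(self):
            self._curl = pycurl.Curl()

        def readv(self, url, ranges):
            """Fetch several inclusive (start, end) byte ranges in one request."""
            # CURLOPT_RANGE takes a comma-separated list, e.g. "10-20,30-40".
            range_str = ','.join(['%d-%d' % r for r in ranges])
            body = StringIO()
            self._curl.setopt(pycurl.URL, url)
            self._curl.setopt(pycurl.RANGE, range_str)
            self._curl.setopt(pycurl.WRITEFUNCTION, body.write)
            self._curl.perform()
            # Expect 206 Partial Content; a multi-range response body is
            # multipart/byteranges and still has to be parsed apart.
            return self._curl.getinfo(pycurl.HTTP_CODE), body.getvalue()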
Another thing we really need is some sort of detection of how large a
Range header a server can handle. It looks like the default limit in
Apache 2 is ~8K per header (i.e. the range string has to be less than
that), but it's configurable _downward_ by users, so it's quite
possible we'll hit servers that allow only 4K or 2K or less.
We could just try a ~8k Range header and, if we get 400 (Bad Request),
fall back to no ranges at all. But it'd be nicer to try to discover the
maximum for a server. That is, start with 8k, and if we fail (400),
drop to 4k/2k/1k/512 etc. We'd then want to remember that value for
the server as long as we can (the life of the bzr process, usually).
Hopefully we won't piss off too many web admins by retrying
continuously, but AFAICT there's no better way.
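
A rough sketch of that discovery loop (again illustrative: do_request
is a hypothetical hook that sends one request with the given Range
header value and returns the status code and body):

    _range_limit = {}  # host -> largest Range header length known to work

    def _header_batches(ranges, limit):
        """Yield 'bytes=...' header values, each at most `limit` characters."""
        parts = []
        for start, end in ranges:
            part = '%d-%d' % (start, end)
            if parts and len('bytes=' + ','.join(parts + [part])) > limit:
                yield 'bytes=' + ','.join(parts)
                parts = []
            parts.append(part)
        if parts:
            yield 'bytes=' + ','.join(parts)

    def readv_with_discovery(host, ranges, do_request):
        limit = _range_limit.get(host, 8192)
        while limit >= 512:
            bodies = []
            for header in _header_batches(ranges, limit):
                status, body = do_request(host, header)
                if status == 400:
                    break  # this server can't take a header this big
                bodies.append(body)
            else:
                _range_limit[host] = limit  # remember it for this server
                return bodies
            limit //= 2  # 8k, then 4k/2k/1k/512, as described above
        return None  # below 512, give up on ranges; fetch whole files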
cheers