[UGLY HACK] Proof of concept multipart/byteranges support and connection sharing

Tue May 23 13:58:12 BST 2006

On 5/23/06, Martin Pool <mbp at canonical.com> wrote:
> On 19 May 2006, Michael Ellerman <michael at ellerman.id.au> wrote:
>
> > The obvious optimisation for bzr at the moment is to make readv() do a
> > proper[1] HTTP range request. So last night I hacked up a _really_
> > horrible implementation, just to see what sort of speed improvement that
> > might get us.
>
> That's very cool; I hope we can get it in when it's less hacky.

I'll try to clean it up, though I'm not exactly bathing in free time at
the moment :)
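
For anyone following along, here is a rough sketch of the kind of range
request readv() would end up building. It is not the code from my patch.
It assumes readv() is handed (offset, length) pairs, and the helper name
range_spec is made up for illustration. Adjacent or overlapping reads are
merged so the server sees fewer ranges. When more than one range is left,
the server is expected to answer with multipart/byteranges.

```python
def range_spec(offsets):
    """Turn (offset, length) pairs into an HTTP byte-range spec.

    Hypothetical helper, not part of bzr or pycurl. Returns a string
    such as '0-14,100-119', or '' when there is nothing to read.
    """
    merged = []
    for start, length in sorted(offsets):
        if length <= 0:
            continue  # nothing to fetch for an empty read
        end = start + length - 1  # HTTP byte ranges are inclusive
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous range: extend it
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return ','.join('%d-%d' % (s, e) for s, e in merged)


spec = range_spec([(0, 10), (10, 5), (100, 20)])
header = 'Range: bytes=' + spec
```

With pycurl the spec would be passed without the "bytes=" prefix, i.e.
curl.setopt(pycurl.RANGE, spec), since libcurl adds the prefix itself.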

> > [2] Unfortunately we're creating about 15 PyCurlTransport() objects, so
> > to see much improvement we have to share the Curl() object globally.
> > Yuck. Also it seems (??) you can't unset pycurl.RANGE/NOBODY, so we have
> > to have three Curl() objects, one for GET, one for HEAD and one for GET
> > + Range.
>
> No, you should be able to do
>
>   curl.setopt(pycurl.NOBODY, 0)
>
> to turn it off - ie to revert from a HEAD to a GET request.  And
> similarly for Range, I'd expect you can set it to an empty string or
> None to turn it off.

Yeah, you should be able to. But running the following script I see two
HEADs go out, although some part of libcurl seems to think the second
request is a GET, because it waits for the content and then times out.
(This is on Dapper, btw.)

#!/usr/bin/python

import pycurl, StringIO

url = 'http://michael.ellerman.id.au/bzr/plugins/shelf/README'

curl = pycurl.Curl()
sio = StringIO.StringIO()
curl.setopt(pycurl.URL, url)
curl.setopt(pycurl.WRITEFUNCTION, sio.write)
curl.setopt(pycurl.VERBOSE, 1)

# Start with a HEAD
curl.setopt(pycurl.NOBODY, 1)
curl.perform()

# This should be a GET, but a second HEAD goes out instead
curl.setopt(pycurl.NOBODY, 0)
curl.perform()

cheers