[BUNDLE] let PyCurl transport read multiple ranges.

John Arbash Meinel john at arbash-meinel.com
Thu Jun 8 15:01:36 BST 2006


Johan Rydberg wrote:
> Hi,
> 
> I spent a few minutes on Michael Ellerman's proof-of-concept patch to
> let readv() read multiple ranges at once.
> 
> The differences between this patch and Ellerman's are minimal:
> 
>  * I've wrapped pycurl.Curl objects in a Curl object.
>  * Transport.clone() reuses the Curl-object from the base transport.
>  * Only one Curl-object is used per transport.
>  * Ranges are combined (e.g., 10-20,20-30 is combined to 10-30.)
> 
> Because of a limitation of PyCurl (it is impossible to reset the RANGE
> option), we always do Range-requests.  This is really not a problem,
> since if only one range is requested, the data is returned in plain
> format (not a MIME multipart message.)
> 
> With the patch, and a caching DNS server, I was able to branch bzr.dev
> in 3m10s.  Without the patch, it took ~8 minutes.  And with neither
> the patch nor a caching DNS server, it takes ~15 minutes.
> 
> ~j
> 
> 

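For reference, the range combining described in the quoted list could be sketched roughly as below. This is not the code from the patch; the function names combine_ranges and range_header are hypothetical, and it assumes ranges are (start, end) pairs that get merged whenever they overlap or touch, as in the 10-20,20-30 example.

```python
def combine_ranges(ranges):
    """Merge (start, end) pairs that overlap or touch, e.g. 10-20 and 20-30."""
    combined = []
    for start, end in sorted(ranges):
        if combined and start <= combined[-1][1]:
            # Extend the previous range instead of starting a new one.
            prev_start, prev_end = combined[-1]
            combined[-1] = (prev_start, max(prev_end, end))
        else:
            combined.append((start, end))
    return combined


def range_header(ranges):
    """Format the combined ranges as an HTTP Range header value."""
    return "bytes=" + ",".join("%d-%d" % r for r in combine_ranges(ranges))
```

Calling range_header([(10, 20), (20, 30)]) gives a single range, which is the case where the server answers in plain format rather than as a MIME multipart message.
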
If you are doing it correctly, you shouldn't need a caching DNS server
for pyCurl objects. They cache their own DNS lookups (something like 60s).

Could you test that to make sure?

I would probably avoid "__readv" because the __ cannot be overridden or
accessed easily in the child.
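
To illustrate the problem with the double underscore, here is a minimal sketch of Python's name mangling. The class names are made up for the example and are not the transport classes from the patch.

```python
class Base(object):
    def __readv(self, offsets):
        return "base"

    def readv(self, offsets):
        # Inside Base this is compiled as self._Base__readv(offsets).
        return self.__readv(offsets)


class Child(Base):
    def __readv(self, offsets):
        # Stored as _Child__readv, so it does NOT override Base's method.
        return "child"


class BaseSingle(object):
    def _readv(self, offsets):
        return "base"

    def readv(self, offsets):
        return self._readv(offsets)


class ChildSingle(BaseSingle):
    def _readv(self, offsets):
        # A single underscore is not mangled, so this overrides normally.
        return "child"
```

Here Child().readv([]) still runs the Base implementation, while ChildSingle().readv([]) runs the child's, which is why a single underscore ("_readv") seems the safer choice.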


I would also not name the wrapper class Curl; perhaps 'CurlWrapper', so
that it isn't confusing what type of object you have.

You changed the code to unset the 'pycurl.FOLLOWLOCATION' flag, which is
a functional change. I'm not sure how we should handle this.

Otherwise the performance effect is very nice. Thanks for your work.

John
=:->

