[rfc] [patch] pycurl transport

Fri Jan 13 08:33:39 GMT 2006

On 11 Jan 2006, John Arbash Meinel <john at arbash-meinel.com> wrote:

> I personally think that seek() + read() is a really poor way of
> describing that you want to read specific chunks of a file. It works
> (and fairly well) for local files. But for remote files, you would want
> to do all sorts of read-ahead/pipelining/etc. And you aren't telling
> anyone what your plan is, even though you have probably created one
> before you issue the first seek.
> 
> As an example, paramiko would really like to set the prefetch flag if
> you are going to be reading a file multiple times in a row. But that
> would be wasted bandwidth if you are going to read a little, seek, read
> a little more, etc.
> 
> And http can't even seek. I'm not positive whether seeking is outweighed
> by round trip time. I'm sure that below some specific read size, it
> is cheaper to read the whole thing rather than seeking in between.
> But doing a single 'give me this range, and this range, and this range'
> should be reasonably efficient.
> 
> But that was also why I wanted the _multi() functions, since you can do
> a little bit of planning, then make a batch call, and with generators,
> you can even do a little bit of work while the information comes in.
> Robert hasn't convinced me that this is evil yet, though I do believe
> all the functionality will end up removed.

Those are definitely important things to consider for formats that want
to read only parts of files.  HTTP is probably both the most constrained
and the most important protocol.  If we had a transport protocol where
the client passed the byte ranges it wants to read to a kind of _multi
function, that might work quite well.
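
To make the idea concrete, here is a rough sketch of what such a
range-based call could look like.  None of these names exist in the
current code: coalesce(), range_header(), read_multi() and the max_gap
parameter are all made up for illustration, and a plain file object
stands in for the remote end.  The caller hands over its whole plan as
(offset, length) pairs, nearby ranges are merged so that small gaps
cost extra bytes rather than a round trip, and the results come back
through a generator.

```python
import io


def coalesce(ranges, max_gap=0):
    """Merge (offset, length) pairs whose gap is at most max_gap bytes.

    Hypothetical helper: reading a small gap can be cheaper than
    paying for another request.
    """
    merged = []
    for start, length in sorted(ranges):
        if length <= 0:
            continue  # nothing to read for empty ranges
        end = start + length
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e - s) for s, e in merged]


def range_header(ranges):
    """Format (offset, length) pairs as an HTTP Range header value.

    HTTP byte ranges are inclusive at both ends, hence the "- 1".
    """
    return "bytes=" + ",".join(f"{s}-{s + n - 1}" for s, n in ranges)


def read_multi(f, ranges):
    """Yield (offset, data) for each requested range, in offset order.

    A seekable file object is used here as a stand-in for a transport;
    an HTTP transport would instead send range_header() in one request.
    """
    for start, length in sorted(ranges):
        f.seek(start)
        yield start, f.read(length)


if __name__ == "__main__":
    wanted = [(100, 20), (0, 10), (10, 5)]
    plan = coalesce(wanted)
    print(plan)
    print(range_header(plan))
    for offset, data in read_multi(io.BytesIO(bytes(200)), plan):
        print(offset, len(data))
```

Because read_multi() is a generator, the caller can start working on
the first range while later ones are still arriving, which is the
property John wanted from the _multi() functions.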

-- 
Martin