[MERGE] http+pycurl supports redirects

Wed Jul 19 13:54:52 BST 2006

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Michael Ellerman wrote:
> On 7/19/06, John Arbash Meinel <john at arbash-meinel.com> wrote:
>> Michael Ellerman wrote:
>> > I don't understand what "body_is_header" is for, we only ever want the
>> > headers associated with the final response don't we?
>>
>> I just made it optional, in case we found something different.
> 
> OK. I'm more of a don't-add-it-until-it's-used kinda guy, but that's
> just me. I still don't grok why it's called "body_is_header" though?

Well, _extract_header isn't a pycurl-only function. It is in
__init__.py, mostly because I use it as part of testing. (I parse the
saved headers and pass them into handle_response.)

If I move it into _pycurl, then I'd be happier not having it parameterized.

> 
>> > I also wonder if it should be a little stricter, and look for
>> > "\r\n\r\nHTTP" as the sign that a new set of headers is starting? I
>> > also think it should be a loop so we can handle a double/triple
>> > redirect.
>>
>> Actually, it is recursive. So it will handle as many as it needs to.
> 
> Oh right, duh. That limits it to some finite number of redirects ..
> but I guess we'll live with that ;)

Finite but very large, yes. Probably we should explicitly limit it to
something like 5 redirects. In case someone creates a redirect loop. :)
Not that it really matters because pycurl is the one *actually*
following the redirects. So by the time we get the data it is too late.

> 
>> I don't know if we want to be 100% strict about requiring '\r\n'. I
>> would be okay with '\n' or '\r\n'.
>> We could be more strict and require response.startswith('HTTP'). By the
>> time we get there, the '\r\n' has already been consumed by the rfc822
>> parser.
> 
> Yeah, not sure. It'd be nice to be somewhat tolerant of a stray blank
> line in the headers, even if that is illegal it might happen. Perhaps
> just require it start with 'HTTP '.
> 
> cheers
> 

Well, I currently do a 'strip()' and only if that is empty do I consider
there to be a body. So I'm already handling stray blank lines.
I could do 'strip().startswith()'.
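
A minimal sketch of that check, purely for illustration. The function
name is hypothetical, and matching on 'HTTP/' (rather than bare 'HTTP')
is an assumption based on status lines looking like 'HTTP/1.1 200 OK':

```python
def looks_like_new_response(body):
    """Return True if `body` appears to start another HTTP response.

    Hypothetical helper, not the actual bzrlib code.
    """
    # Stripping first tolerates stray blank lines ('\n' or '\r\n')
    # between one set of headers and the next.
    return body.strip().startswith('HTTP/')
```

With this, a body that is empty after strip() still counts as "no body",
and anything else that does not start with a status line is real content.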

But if we are going to go this route, I can actually also remove the
'strip first line'. I was porting the header function from
'curl/__init__.py', which is a Python wrapper around pycurl (and ships
with pycurl).

So I'll switch the function to being HTTP-only and use a loop rather
than recursion. It'll clean it up a bit.
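
Roughly what I have in mind for the loop version, as a sketch only. The
name final_header_block and the (status_line, headers) return shape are
made up for this example, and it ignores folded continuation lines:

```python
def final_header_block(raw):
    """Return (status_line, headers) for the last response in `raw`.

    `raw` is the concatenated header text collected across redirects.
    Hypothetical sketch, not the real _extract_header.
    """
    status_line = None
    headers = {}
    # Accept both '\r\n' and '\n' line endings.
    for line in raw.replace('\r\n', '\n').split('\n'):
        line = line.strip()
        if not line:
            continue  # tolerate stray blank lines
        if line.startswith('HTTP/'):
            # A new status line means an earlier response was a redirect,
            # so drop whatever headers were collected for it.
            status_line = line
            headers = {}
        elif ':' in line and status_line is not None:
            name, value = line.split(':', 1)
            headers[name.strip().lower()] = value.strip()
    return status_line, headers
```

Because each status line simply resets the state, a double or triple
redirect needs no special handling and there is no recursion depth to
worry about.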

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEviucJdeBCYSNAAMRAtUjAKCoLlR39VYtJoYYiDZen3pzu622KACfS51a
9/rH5jmyN7IlXFv6QLcJ0Qc=
=ya6+
-----END PGP SIGNATURE-----