[MERGE] Fix ability to use IIS as a dumb HTTP server. (fixes #247585)

Vincent Ladeuil v.ladeuil+lp at free.fr
Mon Jul 14 10:19:39 BST 2008


Thanks a lot for working on this, but see some comments below.

>>>>> "Adrian" == Adrian Wilkins <adrian.wilkins at gmail.com> writes:

<snip/>

    Adrian> +            
    Adrian> +        # parameters in the header all get run through rfc822.unquote
    Adrian> +        # so therefore our boundary strings should too

Absolutely not.

The boundary *definition* appears in headers so it MUST be
unquoted.

The boundary is then *used* in the body, and in these places it
MUST NOT be unquoted.

My intuition is that IIS is not buggy but that you are tricked by
a proxy there.

The bug, as you pinpointed it, is that:

- the boundary is not quoted in the headers as it should be,

- it is unquoted as per RFC822 because it looks like a quoted
  string (http://rfc.net/rfc822.html#s3.3. mentions that "<" and
  ">": "Must be in quoted-string, to use within a word."). So I
  don't think the python module is buggy here.

- being unquoted when it shouldn't have been, it doesn't match
  with the boundary lines.
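
To make the mismatch concrete, here is a minimal sketch. It assumes
that Python 3's email.utils.unquote behaves like the rfc822.unquote
named in the patch comment, and that the string you mentioned is the
boundary value sent without surrounding double quotes; the variable
names are only illustrative.

```python
from email.utils import unquote  # assumed stand-in for rfc822.unquote

# Boundary value as it would appear, without double quotes, in the
# Content-Type header (assumption based on the reported string).
header_value = '<q1w2e3r4t5y6u7i8o9p0zaxscdvfbgnhmjklkl>'

# Header parameters are unquoted, which strips the angle brackets.
parsed_boundary = unquote(header_value)

# The body still uses the raw value after the leading '--'.
body_line = '--' + header_value

# Comparing the body line against the parsed boundary now fails.
matches = (body_line == '--' + parsed_boundary)
print(parsed_boundary)
print(matches)
```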

Unquoting the boundary lines is the wrong place to fix the bug,
it will make bzr fail to match when the boundary definition has
been correctly quoted in the headers.

I'd like to better diagnose the problem before accepting a fix
for it.

Since it will be difficult to reverse the unquoting after the
fact, the fix *may* be to first try the raw boundary line, and if
it doesn't match, but only then, try with an unquoted version as
a *workaround*.
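
A rough sketch of that idea, not a patch against bzrlib: the helper
name is_boundary_line is hypothetical, email.utils.unquote is again
an assumed stand-in for rfc822.unquote, and the closing
'--boundary--' delimiter is deliberately not handled.

```python
from email.utils import unquote  # assumed stand-in for rfc822.unquote


def is_boundary_line(line, boundary):
    """Return True if line delimits a part for the given boundary.

    boundary is the value already parsed (and unquoted) from the
    Content-Type header; line is a raw line read from the body.
    """
    line = line.rstrip('\r\n')
    # Normal case: the raw body line matches the parsed boundary.
    if line == '--' + boundary:
        return True
    # Workaround, tried only after the raw comparison failed: the
    # header value may have been unquoted when it should not have
    # been, so apply the same unquoting to the body line.
    if line.startswith('--'):
        return unquote(line[2:]) == boundary
    return False
```

Trying the raw line first keeps the correctly quoted case working:
is_boundary_line('--<abc>', '<abc>') matches on the first comparison
and never reaches the workaround.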

Can you provide some ethereal/wireshark traces and the .bzr.log
of a command run with -Dhttp?

I just made a quick experiment with the following script:

from bzrlib import (
    debug,
    trace,
    transport,
    )

import bzrlib.transport.http._urllib2_wrappers as u2

u2.DEBUG = 3

def my_mutter(fmt, *args):
    print fmt % args

trace.mutter = my_mutter

# set('http') would build a set of single letters, so wrap the flag in a list
debug.debug_flags = set(['http'])

t = transport.get_transport('http+urllib://download.microsoft.com')
l = list(t.readv('download/0/0/3/003b0b36-e61f-4c79-93c1-637c91fd40c7/Secure_Computing_Mod1_500k.wmv', [(0, 10), (65536, 12)]))

print '%r' % l

And here is the decoded wireshark trace:

GET /download/0/0/3/003b0b36-e61f-4c79-93c1-637c91fd40c7/Secure_Computing_Mod1_500k.wmv HTTP/1.1
Accept-Encoding: identity
Host: download.microsoft.com
Accept: */*
User-Agent: bzr/1.6b3 (urllib)
Connection: Keep-Alive
Pragma: no-cache
Cache-Control: max-age=0
Range: bytes=0-9,65536-65547

HTTP/1.1 206 Partial Content
Date: Mon, 14 Jul 2008 09:16:09 GMT
ETag: "eefde6313290c81:8037"
Last-Modified: Thu, 27 Mar 2008 17:44:24 GMT
Accept-Ranges: bytes
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Content-Type: multipart/byteranges; boundary=D99BE0CCF2B
Connection: close


--D99BE0CCF2B
Content-Type: video/x-ms-wmv
Content-Range: bytes 0-9/16033114

0&.u.f....
--D99BE0CCF2B
Content-Type: video/x-ms-wmv
Content-Range: bytes 65536-65547/16033114

.
..~..d5j..
--D99BE0CCF2B--


The magic '<q1w2e3r4t5y6u7i8o9p0zaxscdvfbgnhmjklkl>' string you
mentioned is not there, which makes me suspect some proxy
tricking you there.


   Vincent

BB: resubmit


