[MERGE] Fix ability to use IIS as a dumb HTTP server. (fixes #247585)

Vincent Ladeuil v.ladeuil+lp at free.fr
Mon Jul 14 18:03:40 BST 2008


>>>>> "Adrian" == Adrian Wilkins <adrian.wilkins at gmail.com> writes:

    Adrian> Vincent Ladeuil wrote:
    >> Thanks a lot for working on this, but see some comments below.
    >> 
    >>>>>>> "Adrian" == Adrian Wilkins <adrian.wilkins at gmail.com> writes:
    >> 
    >> <snip/>
    >> 
    Adrian> +            
    Adrian> +        # parameters in the header all get run through rfc822.unquote
    Adrian> +        # so therefore our boundary strings should too
    >> 
    >> Absolutely not.
    >> 
    >> The boundary *definition* appears in headers so it MUST be
    >> unquoted.
    >> 
    >> The boundary is then *used* and in these places it MUST not be
    >> unquoted.
    >> 

    Adrian> The problem in the Python lib is that the rfc822.unquote() function
    Adrian> considers <this> to be a quoted string, when the standard doesn't.

All right, then it's buggy. It unquoted the boundary when it
shouldn't have.

That's not a good reason to unquote the boundary line when *we*
shouldn't.

    Adrian> There is no harm in running the boundaries as parsed
    Adrian> through the unquote function as conformant
    Adrian> implementations do not contain quote characters as
    Adrian> specified by RFC2046 : 5.1.1

Right. It's also useless most of the time. That's why I ask you
to unquote it *only* if it doesn't match preceded by a comment
explaining why.

You put a lot of effort into explaining the various bugs involved
here, let's not lose that.

    >> My intuition is that IIS is not buggy but that you are
    >> tricked by a proxy there.
    >> 

    Adrian> This was tested straight from the IIS 7.0 install on
    Adrian> my local machine, with no proxy,

Great, thanks for taking the time to check that. I couldn't do it
myself so I trust you there :D

Adrian, don't lose time arguing.

I think your diagnosis is right.
I think your workaround is fine.

I just don't want to merge code that, *as its default code path*,
implements a workaround.

I'm fine with your patch, including the tests, as long as you try
your workaround only when the pristine boundary line doesn't
match, i.e.:

  if boundary_line != '--' + self._boundary + '\r\n':
      # Something is wrong with the boundary line. Since IIS is known to
      # use '<>' as delimiters and python rfc822.unquote wrongly delete 
      # them, let's try again with a mutilated boundary line
      boundary_line = self._unquote_boundary(boundary_line)
      if boundary_line != '--' + self._boundary + '\r\n':
          raise errors.InvalidHttpResponse(
               self._path,
              "Expected a boundary (%s) line, got '%s'" % (self._boundary,
                                                           boundary_line))


or something like that (untested).

One day, rfc822 module may be fixed, IIS may be fixed or new bugs
may appear here or elsewhere, I prefer to keep a base
implementation clean, since that's a guarantee that workarounds
may be easier to implement or that we can safely delete them.

   Vincent




More information about the bazaar mailing list