Rev 3108: (vila) Fix #175886 by reading superfluous bytes sent by dumb http servers by chunk in file:///home/pqm/archives/thelove/bzr/%2Btrunk/
Canonical.com Patch Queue Manager
pqm at pqm.ubuntu.com
Thu Dec 13 19:14:44 GMT 2007
At file:///home/pqm/archives/thelove/bzr/%2Btrunk/
------------------------------------------------------------
revno: 3108
revision-id:pqm at pqm.ubuntu.com-20071213191437-c42q84tx9pf0e2tk
parent: pqm at pqm.ubuntu.com-20071213141047-tklbta8rymzfpj6y
parent: v.ladeuil+lp at free.fr-20071213175256-1tw7ao1l9555n46o
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Thu 2007-12-13 19:14:37 +0000
message:
(vila) Fix #175886 by reading superfluous bytes sent by dumb http servers by chunk
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/tests/test_http_response.py test_http_response.py-20060628233143-950b2a482a32505d
bzrlib/transport/http/__init__.py http_transport.py-20050711212304-506c5fd1059ace96
bzrlib/transport/http/_urllib2_wrappers.py _urllib2_wrappers.py-20060913231729-ha9ugi48ktx481ao-1
------------------------------------------------------------
revno: 3107.1.1
revision-id:v.ladeuil+lp at free.fr-20071213175256-1tw7ao1l9555n46o
parent: pqm at pqm.ubuntu.com-20071213141047-tklbta8rymzfpj6y
parent: v.ladeuil+lp at free.fr-20071213154444-sd76q4uul3qn5ywh
committer: Vincent Ladeuil <v.ladeuil+lp at free.fr>
branch nick: trunk
timestamp: Thu 2007-12-13 18:52:56 +0100
message:
Fix #175886 by reading superfluous bytes sent by dumb http servers by chunk
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/tests/test_http_response.py test_http_response.py-20060628233143-950b2a482a32505d
bzrlib/transport/http/__init__.py http_transport.py-20050711212304-506c5fd1059ace96
bzrlib/transport/http/_urllib2_wrappers.py _urllib2_wrappers.py-20060913231729-ha9ugi48ktx481ao-1
------------------------------------------------------------
revno: 3104.3.5
revision-id:v.ladeuil+lp at free.fr-20071213154444-sd76q4uul3qn5ywh
parent: v.ladeuil+lp at free.fr-20071213154042-77h01zei40qptsuv
committer: Vincent Ladeuil <v.ladeuil+lp at free.fr>
branch nick: 175886
timestamp: Thu 2007-12-13 16:44:44 +0100
message:
Fix typo.
modified:
bzrlib/transport/http/_urllib2_wrappers.py _urllib2_wrappers.py-20060913231729-ha9ugi48ktx481ao-1
------------------------------------------------------------
revno: 3104.3.4
revision-id:v.ladeuil+lp at free.fr-20071213154042-77h01zei40qptsuv
parent: v.ladeuil+lp at free.fr-20071212210605-utjudvizj5v9qotc
committer: Vincent Ladeuil <v.ladeuil+lp at free.fr>
branch nick: 175886
timestamp: Thu 2007-12-13 16:40:42 +0100
message:
Add test.
* bzrlib/transport/http/_urllib2_wrappers.py :
Some cleanup.
* bzrlib/tests/test_http_response.py:
(TestHTTPConnection): Test the warning emission.
modified:
bzrlib/tests/test_http_response.py test_http_response.py-20060628233143-950b2a482a32505d
bzrlib/transport/http/_urllib2_wrappers.py _urllib2_wrappers.py-20060913231729-ha9ugi48ktx481ao-1
------------------------------------------------------------
revno: 3104.3.3
revision-id:v.ladeuil+lp at free.fr-20071212210605-utjudvizj5v9qotc
parent: v.ladeuil+lp at free.fr-20071212161401-uo6j6ohcwkbj1h4p
committer: Vincent Ladeuil <v.ladeuil+lp at free.fr>
branch nick: 175886
timestamp: Wed 2007-12-12 22:06:05 +0100
message:
Jam's and Aaron feedback about bug #175886.
* bzrlib/transport/http/_urllib2_wrappers.py:
(Response.finish): Returned the number of bytes discared.
(AbstractHTTPConnection.cleanup_pipe): Warn the user about servers
that do not recognize range requests.
modified:
bzrlib/transport/http/__init__.py http_transport.py-20050711212304-506c5fd1059ace96
bzrlib/transport/http/_urllib2_wrappers.py _urllib2_wrappers.py-20060913231729-ha9ugi48ktx481ao-1
------------------------------------------------------------
revno: 3104.3.2
revision-id:v.ladeuil+lp at free.fr-20071212161401-uo6j6ohcwkbj1h4p
parent: v.ladeuil+lp at free.fr-20071212154335-ac7fi089uufzbkfz
committer: Vincent Ladeuil <v.ladeuil+lp at free.fr>
branch nick: 175886
timestamp: Wed 2007-12-12 17:14:01 +0100
message:
Fix typo.
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
------------------------------------------------------------
revno: 3104.3.1
revision-id:v.ladeuil+lp at free.fr-20071212154335-ac7fi089uufzbkfz
parent: pqm at pqm.ubuntu.com-20071211175118-s94sizduj201hrs5
committer: Vincent Ladeuil <v.ladeuil+lp at free.fr>
branch nick: 175886
timestamp: Wed 2007-12-12 16:43:35 +0100
message:
Fix #175886 by reading remaining bytes by chunks.
* bzrlib/transport/http/_urllib2_wrappers.py:
(Response.finish): Don't read all remaining bytes in one read or
some buffer may explode.
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/transport/http/_urllib2_wrappers.py _urllib2_wrappers.py-20060913231729-ha9ugi48ktx481ao-1
=== modified file 'NEWS'
--- a/NEWS 2007-12-11 05:13:36 +0000
+++ b/NEWS 2007-12-12 16:14:01 +0000
@@ -25,6 +25,10 @@
removed use characters not supported in the terminal encoding.
(Aaron Bentley)
+ * When dumb http servers return whole files instead of the requested ranges,
+ read the remaining bytes by chunks to avoid overflowing network buffers.
+ (Vincent Ladeuil, #175886)
+
INTERNALS:
API BREAKS:
=== modified file 'bzrlib/tests/test_http_response.py'
--- a/bzrlib/tests/test_http_response.py 2007-12-10 10:41:24 +0000
+++ b/bzrlib/tests/test_http_response.py 2007-12-13 15:40:42 +0000
@@ -44,7 +44,56 @@
errors,
tests,
)
-from bzrlib.transport.http import response
+from bzrlib.transport.http import (
+ response,
+ _urllib2_wrappers,
+ )
+
+
+class ReadSocket(object):
+ """A socket-like object that can be given a predefined content."""
+
+ def __init__(self, data):
+ self.readfile = StringIO(data)
+
+ def makefile(self, mode='r', bufsize=None):
+ return self.readfile
+
+class FakeHTTPConnection(_urllib2_wrappers.HTTPConnection):
+
+ def __init__(self, sock):
+ _urllib2_wrappers.HTTPConnection.__init__(self, 'localhost')
+ # Set the socket to bypass the connection
+ self.sock = sock
+
+ def send(self, str):
+ """Ignores the writes on the socket."""
+ pass
+
+
+class TestHTTPConnection(tests.TestCase):
+
+ def test_cleanup_pipe(self):
+ sock = ReadSocket("""HTTP/1.1 200 OK\r
+Content-Type: text/plain; charset=UTF-8\r
+Content-Length: 18
+\r
+0123456789
+garbage""")
+ conn = FakeHTTPConnection(sock)
+ # Simulate the request sending so that the connection will be able to
+ # read the response.
+ conn.putrequest('GET', 'http://localhost/fictious')
+ conn.endheaders()
+ # Now, get the response
+ resp = conn.getresponse()
+ # Read part of the response
+ self.assertEquals('0123456789\n', resp.read(11))
+ # Override the thresold to force the warning emission
+ conn._range_warning_thresold = 6 # There are 7 bytes pending
+ conn.cleanup_pipe()
+ self.assertContainsRe(self._get_log(keep_log_file=True),
+ 'Got a 200 response when asking')
class TestRangeFileMixin(object):
=== modified file 'bzrlib/transport/http/__init__.py'
--- a/bzrlib/transport/http/__init__.py 2007-12-10 10:41:24 +0000
+++ b/bzrlib/transport/http/__init__.py 2007-12-12 21:06:05 +0000
@@ -139,7 +139,6 @@
# time of this writing it's even the only known client -- vila20071203
return StringIO(response_file.read())
- # TODO: Add tests for tail_amount or deprecate it
def _get(self, relpath, ranges, tail_amount=0):
"""Get a file, or part of a file.
=== modified file 'bzrlib/transport/http/_urllib2_wrappers.py'
--- a/bzrlib/transport/http/_urllib2_wrappers.py 2007-12-06 22:46:16 +0000
+++ b/bzrlib/transport/http/_urllib2_wrappers.py 2007-12-13 15:44:44 +0000
@@ -78,9 +78,6 @@
# Some responses have bodies in which we have no interest
_body_ignored_responses = [301,302, 303, 307, 401, 403, 404]
- def __init__(self, *args, **kwargs):
- httplib.HTTPResponse.__init__(self, *args, **kwargs)
-
def begin(self):
"""Begin to read the response from the server.
@@ -123,6 +120,12 @@
# below we keep the socket with the server opened.
self.will_close = False
+ # in finish() below, we may have to discard several MB in the worst
+ # case. To avoid buffering that much, we read and discard by chunks
+ # instead. The underlying file is either a socket or a StringIO, so reading
+ # 8k chunks should be fine.
+ _discarded_buf_size = 8192
+
def finish(self):
"""Finish reading the body.
@@ -131,11 +134,26 @@
persistent connection. If we don't use a persistent connection, well,
nothing will block the next request since a new connection will be
issued anyway.
+
+ :return: the number of bytes left on the socket (may be None)
"""
+ pending = None
if not self.isclosed():
# Make sure nothing was left to be read on the socket
- data = self.read(self.length)
+ pending = 0
+ while self.length and self.length > self._discarded_buf_size:
+ data = self.read(self._discarded_buf_size)
+ pending += len(data)
+ if self.length:
+ data = self.read(self.length)
+ pending += len(data)
+ if pending:
+ trace.mutter(
+ "bogus http server didn't give body length,"
+ "%s bytes left on the socket",
+ pending)
self.close()
+ return pending
# Not inheriting from 'object' because httplib.HTTPConnection doesn't.
@@ -145,13 +163,16 @@
response_class = Response
strict = 1 # We don't support HTTP/0.9
+ # When we detect a server responding with the whole file to range requests,
+ # we want to warn. But not below a given thresold.
+ _range_warning_thresold = 1024 * 1024
+
def __init__(self):
self._response = None
+ self._ranges_received_whole_file = None
def _mutter_connect(self):
- netloc = self.host
- if self.port is not None:
- netloc += ':%d' % self.port
+ netloc = '%s:%s' % (self.host, self.port)
if self.proxied_host is not None:
netloc += '(proxy for %s)' % self.proxied_host
trace.mutter('* About to connect() to %s' % netloc)
@@ -162,12 +183,19 @@
return self._response
def cleanup_pipe(self):
- """Make the connection believes the response have been fully handled.
-
- That makes the httplib.HTTPConnection happy
- """
+ """Make the connection believe the response has been fully processed."""
if self._response is not None:
- self._response.finish()
+ pending = self._response.finish()
+ # Warn the user (once)
+ if (self._ranges_received_whole_file is None
+ and self._response.status == 200
+ and pending and pending > self._range_warning_thresold
+ ):
+ self._ranges_received_whole_file = True
+ trace.warning(
+ 'Got a 200 response when asking for multiple ranges,'
+ ' does your server at %s:%s support range requests?',
+ self.host, self.port)
self._response = None
# Preserve our preciousss
sock = self.sock
More information about the bazaar-commits
mailing list