Rev 2578: Fix #120678 by issuing a CONNECT request when https is used via a proxy in file:///home/pqm/archives/thelove/bzr/%2Btrunk/

Canonical.com Patch Queue Manager pqm at pqm.ubuntu.com
Tue Jul 3 19:49:21 BST 2007


At file:///home/pqm/archives/thelove/bzr/%2Btrunk/

------------------------------------------------------------
revno: 2578
revision-id: pqm at pqm.ubuntu.com-20070703184919-6iz3vo2rx42smf25
parent: pqm at pqm.ubuntu.com-20070703083649-cbfyk0jt0itbktgb
parent: abentley at panoramicfeedback.com-20070703180034-zuiuqbo5mx0tsa1n
committer: Canonical.com Patch Queue Manager<pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Tue 2007-07-03 19:49:19 +0100
message:
  Fix #120678 by issuing a CONNECT request when https is used via a proxy
modified:
  NEWS                           NEWS-20050323055033-4e00b5db738777ff
  bzrlib/transport/http/_urllib2_wrappers.py _urllib2_wrappers.py-20060913231729-ha9ugi48ktx481ao-1
    ------------------------------------------------------------
    revno: 2540.2.4
    merged: abentley at panoramicfeedback.com-20070703180034-zuiuqbo5mx0tsa1n
    parent: v.ladeuil+lp at free.fr-20070703152901-o25i9ntsv1jj1m4h
    parent: pqm at pqm.ubuntu.com-20070703083649-cbfyk0jt0itbktgb
    committer: Aaron Bentley <abentley at panoramicfeedback.com>
    branch nick: Aaron's integration
    timestamp: Tue 2007-07-03 14:00:34 -0400
    message:
      Merge from bzr.dev
    ------------------------------------------------------------
    revno: 2540.2.3
    merged: v.ladeuil+lp at free.fr-20070703152901-o25i9ntsv1jj1m4h
    parent: v.ladeuil+lp at free.fr-20070626091244-nou3usgr34947wz9
    committer: Vincent Ladeuil <v.ladeuil+lp at free.fr>
    branch nick: 120678
    timestamp: Tue 2007-07-03 17:29:01 +0200
    message:
      Take Aaron's comments into account.
    ------------------------------------------------------------
    revno: 2540.2.2
    merged: v.ladeuil+lp at free.fr-20070626091244-nou3usgr34947wz9
    parent: v.ladeuil+lp at free.fr-20070622075602-1kk9gdry92v2usq3
    committer: Vincent Ladeuil <v.ladeuil+lp at free.fr>
    branch nick: 120678
    timestamp: Tue 2007-06-26 11:12:44 +0200
    message:
      Fix #120678 by issuing a CONNECT request when https is used via a proxy.
      
      * bzrlib/transport/http/_urllib2_wrappers.py:
      (_ConectRequest): Make the class private and better document its
      purpose.
      (AbstractAuthHandler): Limit the attempts to
      authenticate. Otherwise, in the very special case of CONNECT, if
      the credentials are wrong we loop recursively until the stack
      explodes.
    ------------------------------------------------------------
    revno: 2540.2.1
    merged: v.ladeuil+lp at free.fr-20070622075602-1kk9gdry92v2usq3
    parent: pqm at pqm.ubuntu.com-20070620092141-cniojlk01bdec2a1
    committer: Vincent Ladeuil <v.ladeuil+lp at free.fr>
    branch nick: 120678
    timestamp: Fri 2007-06-22 09:56:02 +0200
    message:
      Rough, working, tested against squid+apache in basic auth fix for #120678
      
      * bzrlib/transport/http/_urllib2_wrappers.py:
      (Response.begin): Don't close the connection after a successful
      CONNECT (grr).
      (HTTPConnection.__init__, HTTPSConnection.__init__): Add a
      proxied_host parameter. This is ugly and should be refactored.
      (HTTPSConnection.connect, HTTPSConnection.connect_to_origin):
      Split the httplib.HTTPSConnection so that we can issue a CONNECT
      request *before* swithing the socket to ssl mode.
      (Request.set_proxy): Trap the proxy setting to preserve the
      original host.
      (ConnectRequest): New specific request.
      (ConnectRequest.set_proxy): Trap the proxy setting again to avoid
      losing the original host (ugly).
      (HTTPSHandler.https_request): Wow, we really need https tests,
      https requests common headers were never set :-/
      (HTTPSHandler.https_open): When connecting through a proxy, the
      first request should be a CONNECT to establish the encrypted link.
=== modified file 'NEWS'
--- a/NEWS	2007-07-03 08:36:49 +0000
+++ b/NEWS	2007-07-03 18:00:34 +0000
@@ -12,6 +12,14 @@
     * Commands that use status flags now have a reference to 'help
       status-flags'.  (Daniel Watkins, #113436)
 
+    * Work around python-2.4.1 inhability to correctly parse the
+      authentication header.
+      (Vincent Ladeuil, #121889)
+
+    * Issue a CONNECT request when connecting to an https server
+      via a proxy to enable SSL tunneling.
+     (Vincent Ladeuil, #120678)
+
   IMPROVEMENTS:
 
     * The --lsprof-file option now dumps a text rendering of the profiling
@@ -133,12 +141,6 @@
       avoid useless cloning.
       (Vincent Ladeuil, #110448)
 
-  BUGFIXES:
-
-    * Work around python-2.4.1 inhability to correctly parse the
-      authentication header.
-      (Vincent Ladeuil, #121889)
-
 
 bzr 0.17  2007-06-18
 

=== modified file 'bzrlib/transport/http/_urllib2_wrappers.py'
--- a/bzrlib/transport/http/_urllib2_wrappers.py	2007-07-03 07:03:32 +0000
+++ b/bzrlib/transport/http/_urllib2_wrappers.py	2007-07-03 18:00:34 +0000
@@ -108,6 +108,13 @@
                 if self.debuglevel > 0:
                     print "Consumed body: [%s]" % body
             self.close()
+        elif self.status == 200:
+            # Whatever the request is, it went ok, so we surely don't want to
+            # close the connection. Some cases are not correctly detected by
+            # httplib.HTTPConnection.getresponse (called by
+            # httplib.HTTPResponse.begin). The CONNECT response for the https
+            # through proxy case is one.
+            self.will_close = False
 
 
 # Not inheriting from 'object' because httplib.HTTPConnection doesn't.
@@ -125,16 +132,36 @@
         # Preserve our preciousss
         sock = self.sock
         self.sock = None
+        # Let httplib.HTTPConnection do its housekeeping 
         self.close()
+        # Restore our preciousss
         self.sock = sock
 
 
 class HTTPConnection(AbstractHTTPConnection, httplib.HTTPConnection):
-    pass
+
+    # XXX: Needs refactoring at the caller level.
+    def __init__(self, host, port=None, strict=None, proxied_host=None):
+        httplib.HTTPConnection.__init__(self, host, port, strict)
+        self.proxied_host = proxied_host
 
 
 class HTTPSConnection(AbstractHTTPConnection, httplib.HTTPSConnection):
-    pass
+
+    def __init__(self, host, port=None, key_file=None, cert_file=None,
+                 strict=None, proxied_host=None):
+        httplib.HTTPSConnection.__init__(self, host, port,
+                                         key_file, cert_file, strict)
+        self.proxied_host = proxied_host
+
+    def connect(self):
+        httplib.HTTPConnection.connect(self)
+        if self.proxied_host is None:
+            self.connect_to_origin()
+
+    def connect_to_origin(self):
+        ssl = socket.ssl(self.sock, self.key_file, self.cert_file)
+        self.sock = httplib.FakeSocket(self.sock, ssl)
 
 
 class Request(urllib2.Request):
@@ -171,10 +198,47 @@
         # Some authentication schemes may add more entries.
         self.auth = {}
         self.proxy_auth = {}
+        self.proxied_host = None
 
     def get_method(self):
         return self.method
 
+    def set_proxy(self, proxy, type):
+        """Set the proxy and remember the proxied host."""
+        self.proxied_host = self.get_host()
+        urllib2.Request.set_proxy(self, proxy, type)
+
+
+class _ConnectRequest(Request):
+
+    def __init__(self, request):
+        """Constructor
+        
+        :param request: the first request sent to the proxied host, already
+            processed by the opener (i.e. proxied_host is already set).
+        """
+        # We give a fake url and redefine get_selector or urllib2 will be
+        # confused
+        Request.__init__(self, 'CONNECT', request.get_full_url(),
+                         connection=request.connection)
+        assert request.proxied_host is not None
+        self.proxied_host = request.proxied_host
+
+    def get_selector(self):
+        return self.proxied_host
+
+    def set_proxy(self, proxy, type):
+        """Set the proxy without remembering the proxied host.
+
+        We already know the proxied host by definition, the CONNECT request
+        occurs only when the connection goes through a proxy. The usual
+        processing (masquerade the request so that the connection is done to
+        the proxy while the request is targeted at another host) does not apply
+        here. In fact, the connection is already established with proxy and we
+        just want to enable the SSL tunneling.
+        """
+        urllib2.Request.set_proxy(self, proxy, type)
+
 
 def extract_credentials(url):
     """Extracts credentials information from url.
@@ -215,10 +279,9 @@
     return '%s://%s' % (scheme, host)
 
 
-# The urlib2.xxxAuthHandler handle the authentication of the
-# requests, to do that, they need an urllib2 PasswordManager *at
-# build time*. We also need one to reuse the passwords entered by
-# the user.
+# The AuthHandler classes handle the authentication of the requests, to do
+# that, they need a PasswordManager *at build time*. We also need one to reuse
+# the passwords entered by the user.
 class PasswordManager(urllib2.HTTPPasswordMgrWithDefaultRealm):
 
     def __init__(self):
@@ -244,9 +307,11 @@
             # handled in the higher levels
             raise errors.InvalidURL(request.get_full_url(), 'no host given.')
 
-        # We create a connection (but it will not connect yet)
+        # We create a connection (but it will not connect until the first
+        # request is made)
         try:
-            connection = http_connection_class(host)
+            connection = http_connection_class(
+                host, proxied_host=request.proxied_host)
         except httplib.InvalidURL, exception:
             # There is only one occurrence of InvalidURL in httplib
             raise errors.InvalidURL(request.get_full_url(),
@@ -369,12 +434,11 @@
                 else:
                     # All other exception are considered connection related.
 
-                    # httplib.HTTPException should indicate a bug
-                    # in the urllib implementation, somewhow the
-                    # httplib pipeline is in an incorrect state,
-                    # we retry in hope that this will correct the
-                    # problem but that may need investigation
-                    # (note that no such bug is known as of
+                    # httplib.HTTPException should indicate a bug in our
+                    # urllib-based implementation, somewhow the httplib
+                    # pipeline is in an incorrect state, we retry in hope that
+                    # this will correct the problem but that may need
+                    # investigation (note that no such bug is known as of
                     # 20061005 --vila).
 
                     # socket errors generally occurs for reasons
@@ -385,7 +449,6 @@
                     # FIXME: and then there is HTTPError raised by:
                     # - HTTPDefaultErrorHandler (we define our own)
                     # - HTTPRedirectHandler.redirect_request 
-                    # - AbstractDigestAuthHandler.http_error_auth_reqed
 
                     my_exception = errors.ConnectionError(
                         msg= 'while sending %s %s:' % (request.get_method(),
@@ -500,10 +563,41 @@
 class HTTPSHandler(AbstractHTTPHandler):
     """A custom handler that just thunks into HTTPSConnection"""
 
+    https_request = AbstractHTTPHandler.http_request
+
     def https_open(self, request):
+        connection = request.connection
+        assert isinstance(connection, HTTPSConnection)
+        if connection.sock is None and \
+                connection.proxied_host is not None and \
+                request.get_method() != 'CONNECT' : # Don't loop
+            # FIXME: We need a gazillion connection tests here, but we still
+            # miss a https server :-( :
+            # - with and without proxy
+            # - with and without certificate
+            # - with self-signed certificate
+            # - with and without authentication
+            # - with good and bad credentials (especially the proxy auth aound
+            #   CONNECT)
+            # - with basic and digest schemes
+            # - reconnection on errors
+            # - connection persistence behaviour (including reconnection)
+
+            # We are about to connect for the first time via a proxy, we must
+            # issue a CONNECT request first to establish the encrypted link
+            connect = _ConnectRequest(request)
+            response = self.parent.open(connect)
+            if response.code != 200:
+                raise ConnectionError("Can't connect to %s via proxy %s" % (
+                        connect.proxied_host, self.host))
+            # Housekeeping
+            connection.fake_close()
+            # Establish the connection encryption 
+            connection.connect_to_origin()
+            # Propagate the connection to the original request
+            request.connection = connection
         return self.do_open(HTTPSConnection, request)
 
-
 class HTTPRedirectHandler(urllib2.HTTPRedirectHandler):
     """Handles redirect requests.
 
@@ -625,17 +719,18 @@
 class ProxyHandler(urllib2.ProxyHandler):
     """Handles proxy setting.
 
-    Copied and modified from urllib2 to be able to modify the
-    request during the request pre-processing instead of
-    modifying it at _open time. As we capture (or create) the
-    connection object during request processing, _open time was
-    too late.
-
-    Note that the proxy handling *may* modify the protocol used;
-    the request may be against an https server proxied through an
-    http proxy. So, https_request will be called, but later it's
-    really http_open that will be called. This explain why we
-    don't have to call self.parent.open as the urllib2 did.
+    Copied and modified from urllib2 to be able to modify the request during
+    the request pre-processing instead of modifying it at _open time. As we
+    capture (or create) the connection object during request processing, _open
+    time was too late.
+
+    The main task is to modify the request so that the connection is done to
+    the proxy while the request still refers to the destination host.
+
+    Note: the proxy handling *may* modify the protocol used; the request may be
+    against an https server proxied through an http proxy. So, https_request
+    will be called, but later it's really http_open that will be called. This
+    explains why we don't have to call self.parent.open as the urllib2 did.
     """
 
     # Proxies must be in front
@@ -715,6 +810,10 @@
         proxy = self.get_proxy_env_var(type)
         if self._debuglevel > 0:
             print 'set_proxy %s_request for %r' % (type, proxy)
+        # FIXME: python 2.5 urlparse provides a better _parse_proxy which can
+        # grok user:password at host:port as well as
+        # http://user:password@host:port
+
         # Extract credentials from the url and store them in the
         # password manager so that the proxy AuthHandler can use
         # them later.
@@ -783,6 +882,9 @@
       successful and the request authentication parameters have been updated.
     """
 
+    _max_retry = 2
+    """We don't want to retry authenticating endlessly"""
+
     # The following attributes should be defined by daughter
     # classes:
     # - auth_required_header:  the header received from the server
@@ -792,6 +894,7 @@
         self.password_manager = password_manager
         self.find_user_password = password_manager.find_user_password
         self.add_password = password_manager.add_password
+        self._retry_count = None
 
     def update_auth(self, auth, key, value):
         """Update a value in auth marking the auth as modified if needed"""
@@ -807,6 +910,16 @@
         :param headers: The headers for the authentication error response.
         :return: None or the response for the authenticated request.
         """
+        # Don't try  to authenticate endlessly
+        if self._retry_count is None:
+            # The retry being recusrsive calls, None identify the first try
+            self._retry_count = 0
+        else:
+            self._retry_count += 1
+            if self._retry_count > self._max_retry:
+                # Let's be ready for next round
+                self._retry_count = None
+                return None
         server_header = headers.get(self.auth_required_header, None)
         if server_header is None:
             # The http error MUST have the associated
@@ -882,7 +995,8 @@
         :param request: The succesfully authenticated request.
         :param response: The server response (may contain auth info).
         """
-        pass
+        # It may happen that we need to reconnect later, let's be ready
+        self._retry_count = None
 
     def get_password(self, user, authuri, realm=None):
         """Ask user for a password if none is already available."""




More information about the bazaar-commits mailing list