[MERGE] clean error msg on bzr+ssh timeouts (#115601)

Andrew Bennetts andrew at canonical.com
Thu Oct 25 08:16:21 BST 2007


bb:tweak

Ian Clatworthy wrote:
> When trying to access a branch via bzr+ssh that is inaccessible, the
> code dumps a traceback instead of giving a clean error message. This has
> been reported several times now in various forms - #115601 has 3
> duplicates registered. This change fixes those bugs.

Hooray!

> I'm pretty sure the code is fine though I'm less sure about the tests.
> The tests do detect the 2 scenarios - disconnected during startup and
> disconnected later - but perhaps they ought to be done at the different
> level, e.g. up higher? There are a heap of smart servers tests so there
> could well be much better ways to test this. If so, please speak up.
> Even if these tests are acceptable, I'm keen to learn the better way if
> it exists for next time.

One comment is that the tests are incomplete: the intent (although I see now
this is not at all obvious) is that tests added to TestSmartProtocolOne should
generally also be added to TestSmartProtocolTwo.

[...]
> +class ConnectionErrorOnStartUp(TransportError):
> +
> +    _fmt = "Disconnected from server during negotiation: %(msg)s %(orig_error)s"

“ConnectionFailed” might be a simpler name for this.

> === modified file 'bzrlib/smart/protocol.py'
> --- bzrlib/smart/protocol.py	2007-08-13 01:04:05 +0000
> +++ bzrlib/smart/protocol.py	2007-10-24 04:26:21 +0000
> @@ -399,8 +399,16 @@
>          while not line or line[-1] != '\n':
>              # TODO: this is inefficient - but tuples are short.
>              new_char = self._request.read_bytes(1)
> +            if new_char == '':
> +                # end of file encountered reading from server
> +                if len(line) == 0:
> +                    # disconnected before any data read
> +                    raise errors.ConnectionErrorOnStartUp(
> +                        "please check connectivity and permissions")

This situation does not necessarily mean that this is during “start up” (which
isn't a defined term here) or even that this is due to the connection failing.
This could happen if the first request on pipe/socket succeeds, but then the
connection is closed before the second one is answered.

> +                else:
> +                    # disconnected after some data received
> +                    raise errors.ConnectionReset("connection timeout")

I'd simply call both of these cases “Connection closed”, or perhaps
“Connection closed unexpectedly”.  If the user wants to know more about exactly
at which point the conversation is being broken, perhaps they should be using
-Dhpss.

I think if we want to distinguish between “could not establish connection” and
“connection closed in general” then we should do it at a slightly different
layer.  (I think the issue here is when running SSH in a subprocess we fail to
notice connection errors until we try to read bytes from the subprocess pipe,
whereas with the Paramiko implementation it will raise a more useful exception
at connect time, well before read_bytes is involved.  Probably this isn't worth
the error to fix, as the SSH subprocess should have delivered relevant errors to
the user on stderr already.)

> === modified file 'bzrlib/tests/test_smart_transport.py'
> --- bzrlib/tests/test_smart_transport.py	2007-10-12 05:26:46 +0000
> +++ bzrlib/tests/test_smart_transport.py	2007-10-24 04:26:21 +0000
> @@ -1424,6 +1424,26 @@
>          self.assertOffsetSerialisation([(1,2), (3,4), (100, 200)],
>              '1,2\n3,4\n100,200', self.client_protocol)
>  
> +    def test_connection_error_on_startup(self):
[...]
> +    def test_connection_error_during_line(self):
[...]

As I mention above, these tests need to be added to TestSmartProtocolTwo as
well.  You can probably do this best by using self.client_protocol_class rather
than protocol.SmartClientRequestProtocolOne, and moving these into
CommonSmartProtocolTestMixin.

These tests are good, but they highlight another deficiency: there's no tests I
can see for a connection being closed during read_body_bytes.  If it's not too
hard, it would be great to fix that while you're here (you'll probably find the
build_protocol_waiting_for_body helper method useful for writing that test).

I think what you've fixed is more than worthwhile on its own, so if you'd prefer
I'd be fine with filing a bug about testing (and possibly improving) the
behaviour if a connection drops out during read_body_bytes.  Errors during
read_response_tuple are much more common I expect (because that's where DNS
errors, authentication failures, etc will make themselves known).

-Andrew.




More information about the bazaar mailing list