[xBBxMERGE] Fix test suite regression on OSX in TestReadMergeableFromUrl.test_smart_server_connection_reset.

Andrew Bennetts andrew at canonical.com
Fri Oct 3 01:10:11 BST 2008


Vincent Ladeuil wrote:
> >>>>> "Andrew" == Andrew Bennetts <andrew at canonical.com> writes:
> 
>     Andrew> Andrew Bennetts has voted tweak.
>     Andrew> Status is now: Conditionally approved
>     Andrew> Comment:
>     Andrew> A test for this would be nice.
> 
> Reminder: this test failed on OS X and prompted me to propose
> this fix :-)

Sure.  I should have said (as I did on IRC) that a test that fails
reliably on all/most platforms would be nice.

Or at least, on the platform I use ;)

A test that fails intermittently on one platform is better than nothing,
though.

>     Andrew> I think you could probably make a good one using the
>     Andrew> _DisconnectingTCPServer in test_bundle.
> 
> I think so too :)
> 
> But as said on IRC, I will look into doing the same for other
> smart mediums.
> 
> ...
> 
> Well, that doesn't work as expected to say the least. Roughly
> speaking the model you used for _DisconnectingTCPServer doesn't
> work for http (even if test_http.WallServer existed and provides
> the same service than _DisconnectingTCPServer), since errors are
> handled in two different phases:
> - the request headers handling (i.e. during get()),
> - the body reading handling (i.e during subsequent read()).

Right, this is why I was saying on IRC that these tests will need to be
unique to each medium implementation, because the details of what
“connection interrupted/finished” means is very dependent on the medium.

> The equivalent phases are handled the same way for bzr over TCP
> since all bytes are read via read_bytes.
> 
> But for http, the connection reset is already either a
> ConnectionError or an InvalidHttpResponse in the headers phase
> where read_bytes is not involved at all.
> 
> So I'm back at my earlier question (in what circumstances is the
> smart server involved in reading a *bundle*) but waiting for a
> higher level answer to understand why it is used that way.

Short answer: It's involved in reading a bundle when the user asks bzr
to read one from a smart server :)

The very first thing that cmd_pull and cmd_merge try to do on the remote
location is read_mergeable_from_url.  If the URL is for an HPSS
transport (i.e. its bzr:// or bzr+ssh://, and maybe also bzr+http://),
then this means doing a “get” RPC, and because this is first HPSS call
then if the connection to that URL fails then read_mergeable_from_url is
where that connection failure will happen.
https://bugs.edge.launchpad.net/bzr/+bug/246233 is a relevant bug
report.

> Because it seems to me that special casing ConnectionReset like
> you did will lead to more collisions down the road.
> 
> I'll merge the previous fix to avoid OSX support bitrot but I
> think we should keep discussing and complete that fix.
> 
> I think there is more than appears behind this collision between
> hpss and http:
> 
> - http lacks some error handling in the body reading phase (but
>   since it provides a file-like for the body to upper layers, it
>   need help from upper layers),

Right; any request that returns results incrementally can fail not just
on the first part of the response but any part of the response.
Returning a file-like object is an example of this.

> - we may need to define more specific (regarding causes) and more
>   generic (regarding the medium involved, be it http, ssh, or raw
>   TCP) exceptions to unify the error handling (the except clauses
>   in bundle.read_mergeable_from_transport start to become
>   impressive, speaking of which jam made a comment there
>   regarding sftp which is close to the problem I just encountered
>   (exception during get() or exception during read()).

So I think the three basic events that matter for HPSS media, at least
for stream-based media, are:

  * connection could not be established
  * connection was interrupted (i.e. closed prematurely)
  * connection finished cleanly

We currently treat both of the first two as ConnectionReset, and I think
that's a reasonable simplification: either way the connection failed,
and we should give up and tell the user about it.  Do you think that is
inadequate?

On the topic of the except clauses in read_mergeable_from_transport,
perhaps the problem there is that ConnectionReset is in the wrong place
in our exception hierarchy.  If it wasn't a TransportError, then it
wouldn't need special handling.

-Andrew.




More information about the bazaar mailing list