[MERGE] Return (and expect) '3' rather than '2' from the 'hello' verb (protocol v3 patch 4/7)

Andrew Bennetts andrew at canonical.com
Wed May 14 12:31:49 BST 2008


John Arbash Meinel wrote:
> Andrew Bennetts wrote:
> | John Arbash Meinel wrote:
[...]
> |> My concern with this is the problem we had with the "call_with_body_bytes" for
> |> unknown protocols. Where calling the server with an unknown method made it start
[...]
> |
> | It's not the same problem.  Since V2, we always send a newline-terminated
[...]
> |
> | It is precisely for this reason (so that it is possible and simple for a
> | recipient to reliably know if the message is encoded in a way the recipient can
> | understand) that we send the protocol markers.
>
> Well, there is a further statement about how it will handle when a request is
> unknown. I think the answer is just disconnect.
> *If* it starts with "MESSAGE_V3" and then the actual request is unknown, I think
> the server can safely skip it. We have taken the precaution of defining the
> bytes-on-the-wire with length prefixes, etc, so that the server should now how
> much it needs to consume before it can be safe to read more. (I'm curious about
> security implications here. mostly of the DoS/overload memory/spin on 50 empty
> sockets, waiting for MESSAGE_V???, or that last character)

The security implications here should be pretty small: an attacker sending an
incomplete version marker will cause a small amount of memory to be consumed
(a few Python objects for the socket and protocol decoder), but no other
resources.  Perhaps an idle timeout would be a good idea, though.
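The idle-timeout idea could look something like the following sketch, assuming a plain blocking socket server; the timeout value, `serve_connection` name, and handler are all illustrative, not anything in bzrlib:

```python
import socket

def serve_connection(conn, handle_request, idle_timeout=300.0):
    """Read requests from conn, dropping clients that go quiet.

    A client that sends an incomplete version marker and then nothing
    (e.g. "MESSAGE_V" and silence) only ties up this connection until
    idle_timeout expires, rather than forever.
    """
    conn.settimeout(idle_timeout)
    try:
        while True:
            try:
                data = conn.recv(4096)
            except socket.timeout:
                break  # client went quiet mid-message: drop it
            if not data:
                break  # client closed the connection
            handle_request(data)
    finally:
        conn.close()
```

This bounds how long each of those 50 half-open sockets can sit waiting for the last character of `MESSAGE_V???`.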

A more serious security concern is sending a very large byte buffer: it'd be
pretty easy to DoS the server by constructing a request with a very large
“bytes” message part (i.e. send a length-prefix of 0xffffffff and contents of
4294967284:XXXXXXXXX...).  So a client can trivially consume memory on the
server as fast as it can send bytes to that server (similar things are
possible with V1 and V2 as well).  People deploying public smart servers
should probably use ulimits or similar to cap the memory consumption, so that
an attacker can only DoS one process rather than an entire system.
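Besides ulimits, a server-side cap on the length prefix would blunt this; here is a sketch assuming a 4-byte big-endian prefix, where `MAX_BODY_BYTES`, `MessageTooLarge`, and `read_prefixed_bytes` are made-up names, not bzrlib API:

```python
import struct

MAX_BODY_BYTES = 64 * 1024 * 1024  # hypothetical 64MB cap

class MessageTooLarge(Exception):
    pass

def read_prefixed_bytes(read, max_bytes=MAX_BODY_BYTES):
    """Read a 4-byte big-endian length prefix, then that many bytes.

    Refuses to allocate a buffer larger than max_bytes, so a prefix of
    0xffffffff cannot make the server reserve ~4GB up front.
    """
    prefix = read(4)
    if len(prefix) != 4:
        raise ValueError('truncated length prefix')
    (length,) = struct.unpack('>I', prefix)
    if length > max_bytes:
        raise MessageTooLarge('refusing %d byte body' % length)
    return read(length)
```

The check happens before any body bytes are buffered, so the attacker pays for the bytes they actually transmit, not for the size they claim.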

> Anyway, I think it should be clearly documented how the server needs to handle
> unknown requests, or we *can* get into the same situation again. I *think*
> you've done this for V2 and V3, though I can't say I explicitly read a
> ".disconnect() line". I guess maybe you raise an exception?

Right.  And bzrlib/smart/client.py catches the exception and does the
disconnect.  (The exception never happens on the server-side, because
expect_version_marker=False there; and otherwise unrecognised requests will be
interpreted as version 1.)
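The detect-and-disconnect flow could be sketched like this; the marker bytes, `FakeMedium`, and function names are illustrative stand-ins for bzrlib's real classes, not its actual API:

```python
MESSAGE_VERSION_THREE = b'MESSAGE_VERSION_THREE\n'  # illustrative bytes

class UnexpectedProtocolVersionMarker(Exception):
    pass

class FakeMedium(object):
    """Stand-in for a client medium that can be disconnected."""
    def __init__(self):
        self.disconnected = False

    def disconnect(self):
        self.disconnected = True

def decode_response(raw_bytes):
    """Reject any response that does not start with the expected marker."""
    if not raw_bytes.startswith(MESSAGE_VERSION_THREE):
        raise UnexpectedProtocolVersionMarker(raw_bytes[:24])
    return raw_bytes[len(MESSAGE_VERSION_THREE):]

def read_response(medium, raw_bytes):
    try:
        return decode_response(raw_bytes)
    except UnexpectedProtocolVersionMarker:
        # As bzrlib/smart/client.py does: give up and drop the connection.
        medium.disconnect()
        raise
```

The decoder raises; only the client-side caller decides that the right response to an unrecognised marker is to disconnect.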

[...]
> |> ^- I think this should probably be called "self._default_headers" to make it
> |> clear that this is the suggested headers to be sent, and not the absolute ones.
> |
> | I don't understand.  These are *the* headers that will be used, so long as the
> | protocol encoding supports headers.  “_default_headers” sounds like headers that
> | might be ignored and others sent instead.
>
> I guess I didn't realize this object was only used 1 time. It certainly seems
> like a fairly heavy object to be constructing over and over again.

The decoder?  Why do you say it's heavy?  It has <10 instance variables (and I
can probably trim those if we really needed to), and compared to the cost of
executing a request like get-data-stream I doubt the initialisation cost is
significant.  We are trying to reduce the number of requests we make, after all,
so the cost here should be increasingly unimportant.

[...]
> |> ^- There seems to be a lot of duplicated code between the "_protocol_version is
> |> not None" if statement, and the following loop. Could you abstract this into a
> |> "_call_and_read_response(protocol_version = XXX) instead?
> |> I know it might mean trapping UnexpectedProtocolVersionMarker all the time, but
> |> you could do something like:
> |>
> |> except errors.UnexpectedProtocolVersionMarker:
> |> ~  if self._protocol_version is not None:
> |> ~    # This server already claimed to support this version
> |> ~    raise
> |>
> |> IMO it would be cleaner.
> |
> | Hmm.  That means combining the "we know the version, so just send" logic with the
> | "we're trying to autodetect".  Something about that feels quite wrong to me.
>
> Well, there was always my:
>
> if self._protocol_version is not None:
> ~  # we've finished autodetecting, abort because the server must have lied:
> ~  raise
>
> Which should still cleanly handle the "I'm autodetecting now" from the "I'm just
> processing messages". And not have any code duplication.

That still spreads the autodetection across two methods rather than one.  The
code duplication here is pretty trivial; I don't think it's worth worrying
about.
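To make the trade-off concrete, here is a sketch of keeping both the known-version path and the autodetection in one method; the class, `_send`, and the exception name are illustrative, and the fake `_send` stands in for actually encoding a request and reading a response:

```python
class UnexpectedProtocolVersionMarker(Exception):
    pass

class SmartClient(object):
    """Keeps 'version already known' and 'autodetecting' in one place."""

    def __init__(self, server_version):
        self._server_version = server_version  # stand-in for a real medium
        self._protocol_version = None          # unknown until autodetected

    def _send(self, version, request):
        # Stand-in: a real client would encode the request for this
        # version and raise on an unrecognised response marker.
        if version != self._server_version:
            raise UnexpectedProtocolVersionMarker(version)
        return ('v%d-response' % version, request)

    def call(self, request):
        if self._protocol_version is not None:
            # Version already settled: just send.
            return self._send(self._protocol_version, request)
        # Autodetect: try newest first, fall back on a version mismatch.
        for version in (3, 2):
            try:
                response = self._send(version, request)
            except UnexpectedProtocolVersionMarker:
                continue
            self._protocol_version = version
            return response
        raise RuntimeError('no common protocol version')
```

Once `_protocol_version` is set, a mismatch would propagate out of `_send` directly, which matches the "the server must have lied, so abort" behaviour discussed above.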

[...]
> |> ^- This seems like it should have some amount of knowledge whether
> |> MESSAGE_VERSION_THREE is actually the returned message. Or maybe that just needs
> |> to happen when MESSAGE_VERSION_FOUR comes out.
> |
> | I'm not sure what you mean by this.  This flag is needed on the decoder because
> | the basic decoder is the same for requests and responses, but:
> |
> |  - on the server side, the protocol marker is already consumed because the
> |    server needs to read it first to know which decoder to use.
> |  - on the client side, the protocol marker is not consumed, because the client
> |    logic assumes the response must be in the same version as the request.
> |
> | So the logic is almost the same, except for the initial state, hence the flag.
> |
>
> Ok. There are 2 parts here.
>
> 1) Client versus server, the flag needs to be present not present
>
> 2) When the flag is present, you assume that you are expecting
> MESSAGE_VERSION_THREE. I might have misunderstood where this code was. I thought
> it was generic between V2 and V3 (and a theoretical V4). It may be that this is
> generic only between V3 request and respond.
>
> It sounds like it is my confusion, not yours. But I wanted to make my point
> clear, in case I was right.

It's just common to V3 requests and responses.  So it looks like you were just
confused initially.  Is there something I can do to make this clearer in my
code?  The class is called ProtocolThreeDecoder, which seems clear to my eyes,
but I wrote the code so I'm not a great judge of its clarity :)
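The flag's effect on the decoder's initial state could be sketched as below; `ProtocolThreeDecoder`'s real state machine is much richer, and the marker bytes here are invented for illustration:

```python
MARKER = b'MESSAGE_VERSION_THREE\n'  # illustrative marker bytes

class DecoderSketch(object):
    """One decoder for both sides; only the starting state differs."""

    def __init__(self, expect_version_marker):
        if expect_version_marker:
            # Client side: the response still carries the marker.
            self._state = self._state_expect_marker
        else:
            # Server side: the dispatcher already consumed the marker
            # in order to pick this decoder, so skip straight on.
            self._state = self._state_body
        self._buffer = b''
        self.body = None

    def accept_bytes(self, data):
        self._buffer += data
        self._state()

    def _state_expect_marker(self):
        if len(self._buffer) < len(MARKER):
            return  # wait for more bytes
        if not self._buffer.startswith(MARKER):
            raise ValueError('not a version three message')
        self._buffer = self._buffer[len(MARKER):]
        self._state = self._state_body
        self._state()

    def _state_body(self):
        self.body = self._buffer
```

Everything after the marker is decoded identically on both sides, which is why the single flag suffices.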

> |> MockMedium seems fine. I sort of wondered if we could get away from bytes on the
> |> wire, but I realized that we really do (at least at certain points) want to test
> |> bytes on the wire for these things.
> |> I don't think it is necessarily the best way to test falling back to V2 from a
> |> V3 server, but it is ok.
> |>
> |> Certainly I like to see bytes-on-the-wire tests because they help ensure
> |> compatibility across releases.
> |
> | Right.  I'm certainly open to suggestions for better tests!  But this is the
> | best I've thought of for these tests so far, and I'm pretty happy with it.  For
> | these tests bytes-on-the-wire does seem most appropriate to me.
>
> Well, for the auto-retry specifically, it seemed like testing the logical "try
> v3 then fall back to try v2" would be a better test of the concept.
> bytes-on-the-wire isn't bad, though, since this is a fairly critical stage. And
> certainly, I don't expect there to be any need to change the on-the-wire bytes.
> (We won't be evolving the V2 server anymore, etc.)

Right, often hard-coded serialised data in tests is bad because serialisation
formats change, which causes the tests to be a burden on what ought to be
unrelated code.  In this case bytes-on-the-wire doesn't have that problem, as we
want to interoperate with those serialisations indefinitely.  In fact, writing
the tests like this allows them to keep functioning unchanged even if we
drop the server-side support for old versions while keeping client-side support,
which I think is fairly likely.
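The shape of such a fallback test could be sketched like this, with a stand-in for MockMedium; the wire bytes in the script are invented placeholders, not the real V2/V3 encodings:

```python
class MockMedium(object):
    """Scripted medium: asserts the exact bytes the client writes and
    replays canned response bytes."""

    def __init__(self):
        self._script = []

    def expect(self, request_bytes, response_bytes):
        self._script.append((request_bytes, response_bytes))

    def send_request(self, data):
        expected, response = self._script.pop(0)
        assert data == expected, 'unexpected bytes on the wire: %r' % data
        return response

def call_with_fallback(medium):
    """Try a (made-up) V3-encoded request; retry as V2 if the response
    is not a V3 message."""
    response = medium.send_request(b'MESSAGE_V3 hello')
    if not response.startswith(b'MESSAGE_V3'):
        response = medium.send_request(b'bzr request 2\nhello')
    return response
```

Because the scripted bytes are frozen, the test keeps working even after the server stops speaking V2, as long as the client still emits these exact requests.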

Thanks very much for the careful review.

-Andrew.




More information about the bazaar mailing list