[MERGE] Re: Accelerating push via bzr:// and friends

John Yates jyates at netezza.com
Sat May 24 16:36:44 BST 2008


Andrew Bennetts, discussing TCP_NODELAY (which disables the TCP Nagle algorithm), wrote:

> We originally turned it on because it made things faster. The test suite
> was noticeably slower without it, back when we were developing the initial
> HPSS code. IIRC, without it we'd write a request to a socket, but then
> find that the kernel would hang on to it for some number of milliseconds
> before sending it, even though we were just waiting for a response. Our
> first request is often quite small (e.g. the initial BzrDir.open request,
> even with headers etc, is still only 82 bytes long in v3), so it's
> understandable that the kernel might think it's worth waiting for more.
> This situation is what the NODELAY flag is for, AIUI.

[..SNIP..]

> The problem I've seen with v3 when testing over a 500ms latency connection
> is that the very first write to the socket is sent immediately, but then
> later writes to send the rest of the request are delayed by 1000ms,
> because the kernel waits to see the first ACK before sending any more
> packets. There's no flag I can see to change this (and even if there
> were, I doubt it would be portable).

This behavior is pretty much unavoidable, as it is the mandated slow-start
congestion avoidance algorithm, a required part of any conforming TCP
implementation. In an RPC environment this places a premium on small
messages that fit in a single unfragmented packet.

In any RPC-like scheme you probably want to disable the Nagle algorithm. A
TCP connection is full duplex: there is no communication from the receive
side to the transmit side indicating that a client is waiting for input and
that any pending outbound data should therefore be expedited. Having
disabled Nagle, you then want to submit each complete application-layer
message in a single write. That way every message is sent in the smallest
number of packets with the least latency, especially the final packet,
which might otherwise tarry.

Bulk data movement is qualitatively different. Here you want to take
advantage of TCP's full-duplex nature to keep the send side busy while
processing status messages coming back asynchronously. If the send side
keeps pushing data into the socket every time some space opens up, then
every packet will carry a full MTU's worth of data. Your TCP connection
provides reliability, so status returns need only handle early termination
(abort) and some form of heartbeat.
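The bulk pattern above might be sketched like this in Python, using select() to write whenever buffer space opens up while draining any status traffic on the same connection (the function and callback names are hypothetical):

```python
import select
import socket
import threading

def push_bulk(sock, chunks, on_status):
    """Keep the send side busy; handle status messages asynchronously.

    Writes whenever the socket buffer has room, so outgoing packets stay
    full-sized; any bytes arriving on the same full-duplex connection
    (e.g. an abort notice or heartbeat) are passed to on_status.
    """
    it = iter(chunks)
    pending = b""
    done = False
    while not done or pending:
        readable, writable, _ = select.select([sock], [sock], [])
        if readable:
            status = sock.recv(4096)
            if status:
                on_status(status)  # e.g. check for an abort request
        if writable:
            if not pending:
                try:
                    pending = next(it)
                except StopIteration:
                    done = True
                    continue
            sent = sock.send(pending)  # partial sends are fine
            pending = pending[sent:]

# Loopback demonstration: push three 10 kB chunks to a reader thread.
sender, receiver = socket.socketpair()
received = bytearray()
TOTAL = 3 * 10000

def drain():
    while len(received) < TOTAL:
        received.extend(receiver.recv(4096))

t = threading.Thread(target=drain)
t.start()
push_bulk(sender, [b"x" * 10000] * 3, on_status=lambda s: None)
t.join()
sender.close()
receiver.close()
```

Because send() is called whenever the kernel reports room, the sender never idles waiting for application-level acknowledgements; TCP itself provides the reliability.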

/john

 
