<HTML dir=ltr><HEAD><TITLE>[MERGE] Re: Accelerating push via bzr:// and friends</TITLE>
<META http-equiv=Content-Type content="text/html; charset=unicode">
<META content="MSHTML 6.00.6000.16640" name=GENERATOR></HEAD>
<BODY>
<DIV id=idOWAReplyText39075 dir=ltr>
<DIV dir=ltr><FONT face="Lucida Console" color=#000000 size=2><FONT size=2>
<P>Andrew Bennetts, discussing TCP_NODELAY (the socket option that disables the TCP Nagle algorithm), wrote:</P>
<P>> We originally turned it on because it made things faster. The test suite<BR>> was noticeably slower without it, back when we were developing the initial<BR>> HPSS code. IIRC, without it we'd write a request to a socket, but then<BR>> find that the kernel would hang on to it for some number of milliseconds<BR>> before sending it, even though we were just waiting for a response. Our<BR>> first request is often quite small (e.g. the initial BzrDir.open request,<BR>> even with headers etc, is still only 82 bytes long in v3), so it's<BR>> understandable that the kernel might think it's worth waiting for more.<BR>> This situation is what the NODELAY flag is for, AIUI.</P>
<P>[..SNIP..]</P>
<P>> The problem I've seen with v3 when testing over a 500ms latency connection<BR>> is that the very first write to the socket is sent immediately, but then<BR>> later writes to send the rest of the request are delayed by 1000ms,<BR>> because the kernel waits to see the first ACK before sending any more<BR>> packets. There's no flag I can see to change this (and even if there<BR>> were, I doubt it would be portable).</P>
<P>This behavior is pretty much unavoidable, as it follows from Slow Start,<BR>the congestion-control algorithm that every conforming TCP implementation<BR>is required to implement. In RPC environments this places a premium on small<BR>messages that fit in a single unfragmented packet.</P>
<P>In any RPC-like scheme you probably want to disable the Nagle algorithm. A<BR>TCP connection is full duplex. Nothing tells the transmit side that the<BR>application is now waiting on the receive side for a reply, and that<BR>any pending outbound data therefore ought to be expedited. Having disabled<BR>Nagle, you then want to submit complete application-layer messages. That<BR>will lead to each message being sent in the smallest number of packets with<BR>the least latency, especially the final packet, which might otherwise tarry.</P>
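<P>As a minimal sketch of that advice (not Bazaar's actual code): disable Nagle<BR>on the socket, then hand the kernel each request as one complete buffer. The<BR>helper names and the 4-byte length prefix are assumptions made up for this<BR>illustration; they are not the HPSS wire format.</P>

```python
import socket
import struct


def disable_nagle(sock):
    """Turn on TCP_NODELAY so small writes are not held back by Nagle."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def frame_message(payload):
    """Build one complete message: hypothetical 4-byte big-endian length + body."""
    return struct.pack("!I", len(payload)) + payload


def send_message(sock, payload):
    """Submit the whole application-layer message in a single sendall() call.

    Assembling the message first avoids several tiny writes, each of which
    could otherwise leave the socket as its own small packet.
    """
    sock.sendall(frame_message(payload))
```

<P>Building the message in memory before the write is the point: with Nagle off,<BR>every separate small write may go out as its own packet.</P>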
<P>Bulk data movement is qualitatively different. Here you want to take<BR>advantage of TCP's full-duplex nature to keep the send side busy and process<BR>status messages coming back asynchronously. If the send side keeps pushing<BR>data into the socket every time some space opens up, then all packets will be<BR>full MTU-sized packets. Your TCP connection provides reliability, so status returns need<BR>only handle early termination (abort) and some form of heartbeat.</P>
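<P>A rough sketch of that bulk pattern, assuming a non-blocking socket driven by<BR>select(): keep writing whenever the socket has room, and handle whatever status<BR>bytes arrive in between. The function name, the on_status callback and its<BR>"return False to abort" convention are hypothetical, invented for this example.</P>

```python
import select


def push_bulk(sock, chunks, on_status):
    """Stream chunks to sock while handling status bytes as they arrive.

    on_status(data) is a hypothetical callback; returning False aborts.
    Returns True if everything was handed to the kernel, False if the
    transfer was aborted or the peer closed the connection.
    The socket is left in non-blocking mode.
    """
    sock.setblocking(False)
    chunks = iter(chunks)
    pending = b""
    done = False
    while pending or not done:
        if not pending:
            try:
                pending = next(chunks)
            except StopIteration:
                done = True
                continue
        # Wait until the peer has sent status or the send buffer has room.
        readable, writable, _ = select.select([sock], [sock], [])
        if readable:
            status = sock.recv(4096)
            if not status:
                return False  # peer closed the connection early
            if on_status(status) is False:
                return False  # peer asked us to abort
        if writable:
            try:
                sent = sock.send(pending)
            except BlockingIOError:
                continue  # buffer filled in the meantime; select again
            pending = pending[sent:]
    return True
```

<P>Because the sender never stops to wait for a per-chunk reply, the socket buffer<BR>stays full and the kernel is free to emit full-sized segments.</P>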
<P>/john</P>
<P></FONT></FONT> </P></DIV></DIV></BODY></HTML>