[MERGE] Re: Accelerating push via bzr:// and friends

Wed May 21 14:22:28 BST 2008

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andrew Bennetts wrote:
| Martin Pool wrote:
|> On Tue, May 20, 2008 at 8:10 PM, Robert Collins
|> <robertc at robertcollins.net> wrote:
| [...]
|>> NODELAY stops the kernel buffering, so you get lots of packets on the
|>> wire; otoh it means when we are being chatty we don't have to wait.
|>> Perhaps we don't want NODELAY?
|> TCP_NODELAY does not afaik change any behaviour to do with waiting for
|> acks, but only as Robert indicates changes transmission buffering.  I
|> don't think we would want it on in our case.
|
| We originally turned it on because it made things faster.  The test suite was
| noticeably slower without it, back when we were developing the initial HPSS
| code.  IIRC, without it we'd write a request to a socket, but then find that the
| kernel would hang on to it for some number of milliseconds before sending it,
| even though we were just waiting for a response.  Our first request is often
| quite small (e.g. the initial BzrDir.open request, even with headers etc, is
| still only 82 bytes long in v3), so it's understandable that the kernel might
| think it's worth waiting for more.  This situation is what the NODELAY flag is
| for, AIUI.
|
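
For reference, a minimal Python sketch of turning the flag on for a TCP
socket. The helper name make_nodelay_socket is made up for this illustration
and is not taken from bzrlib. The option's full name is TCP_NODELAY; it
disables Nagle's algorithm, which is the kernel-side holding back of small
writes described above.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled (illustrative helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY lives at the IPPROTO_TCP level, not SOL_SOCKET.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

The option can be set before connecting, so a client can configure the socket
once and then connect as usual.
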
| We aren't ever as pathological as e.g. a user typing in a telnet session.  We
| sometimes wrote requests/responses in multiple calls because that was the most
| convenient thing to do.  In theory the network overhead of 3 TCP packets vs 1
| packet should be trivial, so there's no downside to us using NODELAY.  So I
| think we should keep setting NODELAY.
|
| The problem I've seen with v3 when testing over a 500ms latency connection is
| that the very first write to the socket is sent immediately, but then later
| writes to send the rest of the request are delayed by 1000ms, because the kernel
| waits to see the first ACK before sending any more packets.  There's no flag I
| can see to change this (and even if there were, I doubt it would be portable).
|
| So I guess the only portable way to perform optimally here is to only call
| send(2) once, which at least has the advantage of being a simple-to-explain
| strategy. :)
|
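
The "only call send(2) once" strategy could be sketched roughly as follows.
BufferedSocketWriter is a hypothetical name for this example, not the class in
the attached patch. It assumes the caller writes a request in several pieces
and calls flush() once the request is complete.

```python
class BufferedSocketWriter:
    """Hypothetical helper: collect writes, then pass them to the socket in one call."""

    def __init__(self, sock):
        self._sock = sock
        self._pending = []

    def write(self, data):
        # Nothing reaches the socket yet; just remember the bytes.
        self._pending.append(data)

    def flush(self):
        # Join everything written so far and hand it over in a single call.
        if self._pending:
            payload = b"".join(self._pending)
            self._pending = []
            self._sock.sendall(payload)
```

Note that sendall() may still loop over send(2) internally if the kernel
accepts only part of a large payload, but a small request would normally go
out in one call instead of one call per write().
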
|> I think we are using Python file objects to send and receive, so
|> rather than buffering ourselves we should make sure their buffering is
|> turned on, and then just flush them when we have finished sending the
|> message.
|
| We are actually using socket objects directly when talking over TCP.
|
| Also, just flushing at the end of a message isn't sufficient if there's a body
| stream.  So we need to be flushing after each chunk of a body stream as well as
| at the end of a message.
|
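
A rough sketch of flushing at each body-stream chunk as well as at the end of
the message. The function name and the framing (a 4-byte big-endian length
prefix per chunk, zero length as terminator) are assumptions made for this
example only; they are not the HPSS v3 wire format.

```python
import struct


def send_with_body_stream(sock, header, body_chunks):
    """Send a header, then each body chunk as it is produced.

    Illustrative only: the length-prefix framing is an assumption, not
    the real protocol encoding.
    """
    pending = [header]
    for chunk in body_chunks:
        if not chunk:
            continue  # an empty chunk would be mistaken for the terminator
        pending.append(struct.pack(">I", len(chunk)))
        pending.append(chunk)
        # Flush at the chunk boundary: one sendall() per chunk, with the
        # header riding along with the first chunk.
        sock.sendall(b"".join(pending))
        pending = []
    # End of message: the terminator, plus the header if there was no body.
    pending.append(struct.pack(">I", 0))
    sock.sendall(b"".join(pending))
```

Flushing per chunk lets the receiver start on each chunk as soon as it is
produced, while still avoiding several small writes per chunk.
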
| The attached patch implements appropriate buffering, and adds tests for that
| buffering.  In my testing this brings a push of one revision of bzr.dev over
| 500ms latency TCP down from 3m 19s to 1m 23s.
|
| -Andrew.
|
|

Any chance you could test this with a real connection, and make sure it
continues to work? I'm happy to set up some bzr+ssh or bzr:// ports for you to
connect to on my home machine, which should be enough delay to give meaningful
results.

Otherwise:

BB:approve

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkg0IhQACgkQJdeBCYSNAAM99ACgiPKfkd81dK97M4AXk9NHCKt/
exUAnAoeDZybxiX8pwEf+u2/KoUVjmGU
=xY9V
-----END PGP SIGNATURE-----