[MERGE] Re: Accelerating push via bzr:// and friends
Andrew Bennetts
andrew at canonical.com
Wed May 21 04:52:30 BST 2008
Martin Pool wrote:
[...]
> This seems like a good match for the situation you describe above.
> But setting NODELAY is just kind of hiding the problem, and not
> totally effectively: we will send a stream of small packets when what
> we really want is just to send the whole thing in one go. And it
> seems like the reliable way to do that is to do just a large send, as
> you said. If we've done that I wonder if nodelay will still have any
> effect.
Yeah, I think with a single send we should be fine without NODELAY. We should
probably leave it set though, because it did make a difference for v1 (and
presumably v2), and it would be a shame to needlessly regress performance when
interoperating with older clients/servers.
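(For reference, disabling Nagle's algorithm on a plain socket is only a couple
of lines; this is a generic sketch rather than the actual smart-server medium
code:)

    import socket

    # Sketch only, not the bzrlib medium code: disable Nagle's algorithm so
    # small writes go out immediately instead of waiting to be coalesced.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)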
[...]
> > We are actually using socket objects directly when talking over TCP.
>
> We should try to make sure that we also do large writes to the pipe to
> the child process when using ssh to give it the best chance at also
> sending large packets.
Agreed.
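(Sketch of the idea, not bzrlib code: accumulate the outgoing bytes and hand
the ssh subprocess one big write; the host and payload below are invented for
illustration.)

    import subprocess

    # Sketch only: spawn the ssh subprocess and give it a single large write,
    # so it has the best chance of putting large packets on the wire.
    proc = subprocess.Popen(
        ['ssh', 'example.com', 'bzr', 'serve', '--inet'],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    pending_chunks = [b'request header\n', b'request body, part 1', b'part 2']
    proc.stdin.write(b''.join(pending_chunks))  # one big write, not many small
    proc.stdin.flush()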
[...]
> > The attached patch implements appropriate buffering, and adds tests for that
> > buffering. In my testing this brings a push of one revision of bzr.dev over
> > 500ms latency TCP down from 3m 19s to 1m 23s.
>
> Wow, nice result!
>
> This seems to do unboundedly large buffering, which I think is too
> much. In extreme cases we may run out of memory, but even at smaller
> layers send() may refuse to accept messages which are too big to
> atomically pass to the stack. I think buffering say 64 or 128k will
> be enough to avoid small packets. Also, (though this may not be a
> problem at present) if we have a large body we do want to start
> transmitting as soon as we can. Ideally you would also test for this.
We're already insulated from dangerously large send()s because we use
osutils.send_all, which breaks up large writes.
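(For context, send_all behaves roughly like the sketch below; the real
function in bzrlib.osutils may differ in chunk size and error handling.)

    def send_all(sock, data, chunk_size=64 * 1024):
        """Send all of data on sock in moderately sized chunks.

        Simplified sketch of what osutils.send_all does, so a multi-megabyte
        buffer never hits the network stack as one enormous send().
        """
        for pos in range(0, len(data), chunk_size):
            sock.sendall(data[pos:pos + chunk_size])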
If something sends a large monolithic body, then memory consumption will be
high, but that's already true today. The real fix is to avoid holding very
large strings in memory in the first place, I think. We could certainly do
better, but this isn't a problem introduced by this patch: even
bzrlib.transport.remote.RemoteTransport.put_file simply reads the whole file
into memory and then calls self.put_bytes.
I expect we'll eventually add a variant of Requester.call that accepts a
file-like object rather than a string. Similarly we might extend the API for
sending body streams to allow a chunk to be a file-like object instead of a
string. It doesn't seem to be one of our bottlenecks at the moment though, so
I'm happy to defer tackling this until more severe performance problems are
fixed.
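(A hypothetical sketch of the kind of thing such a variant might consume; no
such helper exists in bzrlib today.)

    def iter_file_chunks(a_file, chunk_size=64 * 1024):
        """Yield successive chunks of a_file.

        Hypothetical helper: a body stream built from this generator would
        let us send a large file without ever holding it all in memory.
        """
        while True:
            chunk = a_file.read(chunk_size)
            if not chunk:
                break
            yield chunk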
> +class StubMediumRequest(object):
> +    """A stub medium request that tracks the number of times accept_bytes is
> +    called.
> +    """
> +    def __init__(self):
> +        self.calls = []
> +    def accept_bytes(self, bytes):
> +        self.calls.append('accept_bytes')
> +    def finished_writing(self):
> +        self.calls.append('finished_writing')
> +
>
> PEP8 :)
Oops, I originally wrote this inline in a test method. Thanks.
> Thanks for adding tests for this, I know it would have been tempting
> to just rely on your own measurements.
I was particularly happy with the TestResponseEncoderBufferingProtocolThree test
case. It's nice when it turns out code is already amenable to unanticipated
tests.
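(The property being exercised is, roughly, that the whole encoded response
reaches the medium as a single write. A self-contained sketch of that style of
assertion follows; the BufferingEncoder here is a toy stand-in, not the real
protocol-three encoder or the actual test code.)

    class StubMediumRequest(object):
        """Stub that records each call made to it, as in the patch above."""
        def __init__(self):
            self.calls = []
        def accept_bytes(self, data):
            self.calls.append('accept_bytes')
        def finished_writing(self):
            self.calls.append('finished_writing')

    class BufferingEncoder(object):
        """Toy stand-in for the response encoder: buffer, then flush once."""
        def __init__(self, medium_request):
            self._medium = medium_request
            self._buf = []
        def write(self, data):
            self._buf.append(data)
        def finish(self):
            self._medium.accept_bytes(''.join(self._buf))
            self._medium.finished_writing()

    medium = StubMediumRequest()
    encoder = BufferingEncoder(medium)
    encoder.write('response headers')
    encoder.write('response body')
    encoder.finish()
    # Buffering means everything arrived in a single accept_bytes call:
    assert medium.calls == ['accept_bytes', 'finished_writing']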
-Andrew.