[step 11] request: 12 steps towards a high performance server

John Arbash Meinel john at arbash-meinel.com
Thu Sep 14 01:40:05 BST 2006


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> On Wed, 2006-09-13 at 14:46 -0500, John Arbash Meinel wrote:
>> Is the smart server intended to be more connectionless? (I think RPC
>> generally works that way). If it is, I guess that is okay. Though I
>> think I would rather it be more of a conversational server. Such that
>> you start with a handshake, and then talk back and forth from there. 
> 
> Some notes on the current state of things.
> 
>  * The protocol is versioned so that we can change it from release to
> release. At least in early days there is no plan to support multiple
> versions at once in either client or server. So we can address 'the
> protocol is wrong' post merge, and post-release if needed.

I think we'll want to address it fairly soon, since you start running
into the same upgrade/watershed issues that can happen with directory
formats.

This also ties in with the handshake question. Being stateless means
that you probably need to include version information in every request,
so you end up with stuff like "GET 1 ./foo/bar" and
"READV 1 ./foo/bar 10 20".
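
To make the idea concrete, here is a minimal sketch of parsing such a
request. It assumes a whitespace-separated line of the form
"<VERB> <version> <path> [integer args...]", which is only my guess at a
layout based on the two examples above, not the actual smart server
protocol. parse_request and SUPPORTED_VERSIONS are hypothetical names.

```python
# Hypothetical wire format: "<VERB> <protocol-version> <path> [int args...]".
# Paths containing whitespace are not handled in this sketch.
SUPPORTED_VERSIONS = {1}


def parse_request(line):
    """Split one self-contained request line into its parts."""
    parts = line.split()
    if len(parts) < 3:
        raise ValueError("malformed request: %r" % (line,))
    verb, version, path = parts[0], int(parts[1]), parts[2]
    # Every request carries its own version, so a stateless server can
    # reject an unknown one without any earlier handshake.
    if version not in SUPPORTED_VERSIONS:
        raise ValueError("unsupported protocol version: %d" % version)
    args = [int(a) for a in parts[3:]]
    return verb, version, path, args
```

Because the version travels with each request, a fresh server instance
can validate it on its own.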

Also, one thing that would be really nice to have in the initial RPC
code would be an advanced readv() implementation, similar to what http
implements.

sftp is currently slower than http, because we cannot request multiple
ranges. (And we are limited to 32K per request). paramiko 1.6.? adds the
ability to do these requests asynchronously, which helps (511s versus
450s), but http still does a lot better (318s).

With http, we can ask for several large ranges in a single request.
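
As a rough illustration of what I mean, the sketch below merges
(start, length) pairs into larger spans and formats them as a single
HTTP Range header. coalesce, range_header and the fudge parameter are
names made up for this example; this is not the code in the http
transport, just the general technique.

```python
def coalesce(offsets, fudge=0):
    """Merge (start, length) pairs that touch or are within `fudge` bytes.

    Returns a list of (start, end) spans with an exclusive end.
    """
    merged = []
    for start, length in sorted(offsets):
        end = start + length
        if merged and start <= merged[-1][1] + fudge:
            # Close enough to the previous span: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def range_header(offsets, fudge=0):
    """Build a multi-range HTTP Range header value from (start, length) pairs."""
    # HTTP byte ranges are inclusive at both ends, hence end - 1.
    spans = coalesce(offsets, fudge)
    return "bytes=" + ",".join("%d-%d" % (s, e - 1) for s, e in spans)
```

For example, range_header([(0, 10), (10, 5), (100, 20)]) yields
"bytes=0-14,100-119", so three reads become one request with two ranges.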

This specific implementation is actually going to be significantly
slower than plain sftp for any sort of partial operation (pull), because
it downloads the entire content into a StringIO(), which it then seeks
around in. (push should be faster, because we have a real append
function, which also makes it faster than http.)

Oh, and I'm thinking that the default implementation of
'put_file_non_atomic' is probably going to be broken, because it will
read the file data, and try to create a remote knit, only to fail
because the parent dir doesn't exist. And when it retries the put, the
file contents have already been read.
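
A sketch of the problem and one possible way around it, assuming the
retry path looks roughly like "try the put, create the parent on
failure, put again". FakeTransport, NoSuchFile and this
put_file_non_atomic are stand-ins written for the example, not the real
transport classes. The point is only that the file object is read once
and the bytes are reused on the retry.

```python
import io


class NoSuchFile(Exception):
    """Raised by the stand-in transport when a parent dir is missing."""


class FakeTransport:
    """In-memory stand-in for a transport (hypothetical, for illustration)."""

    def __init__(self):
        self.dirs = set()
        self.files = {}

    def mkdir(self, path):
        self.dirs.add(path)

    def put_bytes(self, path, data):
        parent = path.rpartition("/")[0]
        if parent and parent not in self.dirs:
            raise NoSuchFile(parent)
        self.files[path] = data


def put_file_non_atomic(transport, path, fileobj):
    # Read once up front. Calling fileobj.read() again on the retry would
    # return nothing, because the first attempt already consumed the file.
    data = fileobj.read()
    try:
        transport.put_bytes(path, data)
    except NoSuchFile:
        transport.mkdir(path.rpartition("/")[0])
        transport.put_bytes(path, data)
```

Seeking the file object back to 0 before the retry would also work, but
only for seekable inputs.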


>  * The protocol is deliberately stateless, so that it will operate over
> http with a new server instance on every request. This is important for
> both http compatibility, and scalability.
> 
> So you can handshake and build up a larger request, but every request is
> standalone and self-contained.
> 
> -Rob

Sure.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFCKTlJdeBCYSNAAMRAtQVAKCovX+am5VV3W8Gbd+XIpdBCjPB0gCgxVcH
gdvOq6WQIzvm/Xd4rfGstVQ=
=iy6B
-----END PGP SIGNATURE-----



