[step 11] request: 12 steps towards a high performance server
John Arbash Meinel
john at arbash-meinel.com
Fri Sep 15 17:03:02 BST 2006
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Andrew Bennetts wrote:
> On Wed, Sep 13, 2006 at 02:46:30PM -0500, John Arbash Meinel wrote:
...
>> ^- So the opening request is a plain 'hello' in order to get the 'ok +
>> version number'? I would rather avoid things that other protocols might
>> want to use.
>
> In general servers and clients have to cope with being accidentally connected to
> something speaking another protocol. The worst that will typically happen is
> both sides wait indefinitely for each other to say something, and eventually
> timeout. Often they'll immediately both notice the other end is speaking
> gibberish, and give up.
>
> So I don't think there's much need to worry about this. Do you have a specific
> problem in mind?
Well, I've heard of people multiplexing HTTPS and SSH on port 443 to get
around firewall restrictions, etc. And also, I think it would be good to
be able to multiplex regular HTTP and hpss traffic. So that old clients
can just connect to a plain HTTP port. But I don't have any major issues
here. Just that a plain 'hello' seems a little too simple.
...
>> v-- This brings up something that I've wanted to see for a while. And
>> that is an 'assertRaises' that somehow checks the string representation
>> of the exception. Because we check that exceptions are raised, but not
>> that they will be meaningful to the user when its done.
>>
>>> + try:
>>> + self.transport.get('not%20a%20file')
>>> + except errors.NoSuchFile, e:
>>> + self.assertEqual('not%20a%20file', e.path)
>>> + else:
>>> + self.fail("get did not raise expected error")
>
> Actually, what is better is an assertRaises that returns the exception caught,
> so you can not only check the string representation, but inspect the attributes
> and distinguish subclasses. Maybe we should make a new catchException helper,
> or just extend the existing assertRaises. The Twisted test suite does the
> latter, fwiw.
Well, I'd like to see a base-level function that just does what you say.
But I'd also like to see some sort of helper. The easier we may it to
write complete tests, the more of them we get written. Just returning
the object allows you to check more, but it gets difficult to grab the
object, and then right several more lines to inspect the object.
I don't really know what the exact helper would look like, because there
are a lot of conflicting goals. But it would be nice to write a single
line that checks an exception is raised, and then does some inspection
on that exception.
...
> Transport.close/disconnect is a problematic issue Andrew and I are talking about
> at the moment, probably we'll have a spec for 0.12 about this. Consider that we
> want Transport to behave as a Proxy for a remote directory. For testability and
> model-compatability with HTTP its easier if this is essentially stateless, and
> close/disconnect are most definately not stateless. Consider what happens when
> we have 3 or four transports sharing a single stream-based client - i.e. SFTP
> transports, FTP transports, and HTTP with connection sharing, SmartTransport
> with its client: does disconnect that ONE Transport is finished with, or that
> they all should get disconnected, but an implicit reconnection may occur?
> Consider the impact of that ambiguity with the requirement that we only prompt
> for SSH passwords etc once during a single session with a single remote server.
> (And that we dont ever get a copy of the SSH passphrase either!)
I'm not sure that Tranports are really meant to be stateless. Even with
HTTP, if you have keep-alive, it is no longer 100% stateless. There is
at least a connection that is being maintained.
I do understand the problem we have of multiple Transports all connected
to the same socket, and we are done with one of them, but not all of them.
...
>>> +# TODO: Client and server warnings perhaps should contain some non-ascii bytes
>>> +# to make sure the channel can carry them without trouble? Test for this?
>> ^- Should 'warnings' contain them, or handshaking stuff (like 'hello')
>
> This is about making sure the protocol blows up if the channel is not suitable
> for the rpc-like protocol we are implementing -- that is, that we do not get
> silent failures in integrity. Unfortunately we think that just throwing some
> non-ascii bytes on the wire is insufficient, and instead a cheap checksum in the
> headers of each chunk of data, and possibly the tuple level too, would provide
> much more complete integrity with less chance of missing any imaginable form of
> accidental corruption. So we've updated the TODO.
A checksum seems like a great thing. I know CRC and Adler are both
available through the zlib library. You could always do md5/sha but for
the size of the chunks, and because you are looking for accidental
corruption, not malicious hacking, I think CRC or Adler are sufficient.
In my mind, it would be good to send the checksum as part of the header.
So ok\1checksum: XXYYZZ\1bytes: 100\ndata...done\n. Or however you want
to split it up. \1 or \n, or whatever.
I'm just thinking that doing it at the beginning is better than at the end.
...
>> v-- I didn't see any tests with a LocalTransport as the backing server.
>> It seems reasonable to have one that checks to make sure files show up
>> on the filesystem.
>>
> [...]
>
> There are already tests that check that files turn up on the backing transport.
> I think the coverage of this aspect is already fine.
I didn't see any tests that did 'put('foo')'
self.failUnlessExists('foo'). But that doesn't mean they weren't there.
...
> It's the thing that takes bytes in, dispatches to a handler method, and returns
> bytes. Calling it a "server" seems accurate to us.
Except that you have another "server" layered above it, right? (Whether
it is TCP or SSH, etc). It seems weirder to me to have a Server which
spawns a Server, which creates Server requests. Rather than having a
Handler which hands off to another Handler.
But that is just my 2c.
...
> It might be nice if Transports could choose to just implement put_bytes and have
> put_file forward to that by default, or vice versa, rather than having to
> implement both. Robert suggests we could check in the root put_file and see if
> self.put is the same function as Transport.put, and if thats the case forward to
> put_bytes rather than warning and forwarding to put()?
Well, for WebDav, that is what we did. Only we just explicitly forwarded
over to put_bytes.
I have no problems with something like:
class Transport:
def put(self, ...):
# deprecated warning
if isinstance(x, str):
self.put_bytes()
else:
self.put_file()
def put_file(self ...)
if self.put == Transport.put:
self.put_bytes(f.read())
else: #Assume this is an old Transport that implements put()
self.put(f)
def put_bytes(self, ...):
if self.put == Transport.put:
self.put_file(StringIO(bytes))
else:
self.put(StringIO(bytes))
Then you would only have to implement 1 of put_file() or put_bytes()
however was appropriate.
...
>> v- As mentioned earlier, I think '--port' is a bad name for this parameter.
>>
>>> +class cmd_serve(Command):
>>> + """Run the bzr server.
>>> + """
>>> + takes_options = [
>>> + Option('inet',
>>> + help='serve on stdin/out for use from inetd or sshd'),
>>> + Option('port',
>>> + help='listen for connections on nominated port of the form '
>>> + '[hostname:]portnumber. Passing 0 as the port number will '
>>> + 'result in a dynamically allocated port.',
>>> + type=str),
>>> + Option('directory',
>>> + help='serve contents of directory',
>>> + type=unicode),
>>> + ]
>
> Got a better name? FWIW squid uses http_port to mean a field that accepts
> 'host:port', 'port' etc.
>
> Also we should set a default port for the bzr:// url scheme.
Well, earlier I mentioned '--address'. I don't think it is critical to
not require people to specify a host, since everyone knows about
'localhost'. But, whatever. I do agree that we want to come up with a
relatively unused port for the 'bzr://' scheme. And I'm not super facist
about the --port flag.
>
>> ---
>> Finally... That took a little while.
>>
>> Anyway, I think there is a lot of promise here. But there are a lot of
>> points of design that haven't been revealed until now, so I had a lot of
>> comments to make. Some of it should block on getting it accepted, but
>> not all of it.
>
> I hope we've addressed this sufficiently to get +1
>
> Thanks for the review!
>
> -Andrew and Robert.
It is good enough for me. I realize that this hasn't shown up yet, but
hopefully it will over the weekend. (On IRC I had given you a "+1 if you
do some cleanup"). But to be explicit: +1 for 0.11.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFFCs62JdeBCYSNAAMRAl7WAJwK7lqtPqk60RbzU/Ys6UyqJ5EjEACfQ3tX
RU/0DTzib+BFDZuAYboB+uE=
=rB77
-----END PGP SIGNATURE-----
More information about the bazaar
mailing list