bzr.dev <-> bzr.dev network api break

John Arbash Meinel john at arbash-meinel.com
Mon Jan 21 23:09:25 GMT 2008


Andrew Bennetts wrote:
> John Arbash Meinel wrote:
>> Andrew Bennetts wrote:
> [...]
>>> Oh, and something else that would be nice:
>>>   * allowing 8-bit clean request/response args (i.e. provide a way to 
>>> encode
>>>     0x01 and 0x0a).
>>> -Andrew.
>> Having encountered the bug with Repository.stream_revisions.... where arg 
>> parsing is slow, I would actually push for moving more stuff into the 
>> "body" of the request.
>>
>> Then you could turn them into Length + Data like we do for bulk data. 
>> Probably what I would prefer is something like:
> 
> Well, we can do length-prefixing of arguments without moving them into the body.
> Same benefit to parsing, no change to semantics...

With my definition, it is sort of arbitrary what you call 'body' and 
what you call 'request'. That is why I broke it up into:

REQUEST ::= REQUEST_NAME REQUEST_ARGS BODY
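
Just to be concrete, here is a rough sketch in Python of what I mean by
length-prefixed fields. The function name and the 4-byte layout are made up
for illustration; this is not the actual bzrlib wire format:

import struct

def encode_request(name, args, body=b''):
    # Sketch only, not the real bzrlib encoding: every field is a
    # 4-byte big-endian length followed by the raw bytes, so argument
    # values are 8-bit clean and the parser never scans for delimiters.
    out = [struct.pack('>I', len(name)), name,
           struct.pack('>I', len(args))]          # num_args
    for arg in args:
        out.append(struct.pack('>I', len(arg)))
        out.append(arg)
    out.append(struct.pack('>I', len(body)))
    out.append(body)
    return b''.join(out)

# e.g. encode_request(b'Repository.stream_revisions', [b'rev-id-1', b'rev-id-2'])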

> 
> [I also seem to recall that we can fix the pessimistic reading from pipes without
> changing the protocol, although I'd have to dig through the list archives for
> the details.  But in general I do like length-prefixing.]

I think length prefixing is just the "proper" way of doing it. You can 
also set pipes to be non-blocking, and then you just read whatever data 
happens to be available.
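
For example, a rough POSIX-only sketch using fcntl (which is exactly the
piece that does not exist on Windows):

import fcntl
import os

def read_available(fd, limit=65536):
    # POSIX-only sketch: mark the pipe non-blocking, then return
    # whatever bytes happen to be buffered instead of blocking until a
    # full read completes.
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
    try:
        return os.read(fd, limit)
    except BlockingIOError:
        return b''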

What I did see is that you can't use select() on pipes on Windows: 
there, select() is only available for sockets (it is provided by the 
socket library).

The other reason to length prefix is that with a generic framing you can 
easily drain the socket for requests you don't understand: even if you 
don't know what the arguments mean, you know there are, say, 800 bytes 
you need to throw away.
Of course, with our current encoding you just read until a '\n', which 
is also at least request agnostic.
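
Something like this sketch (an invented helper, not real bzrlib code) is
all it takes once the total length is known up front:

def skip_unknown_request(sock, total_length):
    # Sketch: with a length prefix up front, an unrecognised request
    # can be discarded by reading exactly total_length bytes and
    # throwing them away, with no knowledge of the argument structure.
    remaining = total_length
    while remaining:
        data = sock.recv(min(remaining, 65536))
        if not data:
            raise EOFError('connection closed mid-request')
        remaining -= len(data)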

I certainly think we want "protocol 3" to have a fairly generic request 
layout, so that the overall "this is a request" framing can be understood 
without knowing anything else about the request. We got close with 
"protocol 2"; we just need to do the next couple of pieces.
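
In other words, a completely generic reader, the counterpart to the
encoder sketched above, never needs request-specific knowledge (again,
the names are invented, just to illustrate):

import struct

def _read_exact(sock, n):
    buf = b''
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise EOFError('connection closed mid-request')
        buf += chunk
    return buf

def decode_request(sock):
    # The framing alone is enough to pull a complete request off the
    # wire, whether or not the server recognises the request name.
    def read_field():
        (length,) = struct.unpack('>I', _read_exact(sock, 4))
        return _read_exact(sock, length)
    name = read_field()
    (num_args,) = struct.unpack('>I', _read_exact(sock, 4))
    args = [read_field() for _ in range(num_args)]
    body = read_field()
    return name, args, body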

> 
> [...]
>> I would even go so far as to make fields like num_args and *length 32-bit 
>> big-endian (un?)signed integers. Maybe not, but I do find having to "read a 
>> little bit, oops not done, read a bit more, nope still not done" to be a 
>> poor way to parse streams.
> 
> A tradeoff there is that you introduce finite limits to lengths; in theory, you
> could send many gigabytes in a single body (or chunk) over the smart protocol at
> the moment...
> 
> Of course, we're a long way from having that be practical, and even if it were
> practical that doesn't mean it's a good idea.

My feeling was that as long as we have a well-formulated "chunk" mode, 
you could always split large bodies into <2GB chunks. I realize that with 
our current separations we couldn't guarantee that, but it seems like it 
would be pretty easy to add an "if len(chunk) > 2**31:" check.
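
A sketch of what that chunking could look like, purely illustrative, with
a 4-byte length before each chunk and a zero-length chunk as terminator:

import struct

MAX_CHUNK = 2 ** 31 - 1   # keep each chunk length inside a signed 32-bit int

def write_chunked(sock, data):
    # Sketch of a chunked body: each chunk is a 4-byte big-endian
    # length followed by the bytes, with a zero-length chunk marking
    # the end.  Oversized payloads are simply split, not rejected.
    for offset in range(0, len(data), MAX_CHUNK):
        chunk = data[offset:offset + MAX_CHUNK]
        sock.sendall(struct.pack('>I', len(chunk)) + chunk)
    sock.sendall(struct.pack('>I', 0))   # end-of-body marker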


> 
>> If we do stick with ascii numbers, I think adding terminators is actually a 
>> good thing. As it lets us read a minimum of 2 bytes at a time :).
> 
> Heh.
> 
> -Andrew.
> 
> 

John
=:->
