PING: reviews needed

Fri Nov 10 17:03:22 GMT 2006

On Wed, Nov 08, 2006 at 11:29:36AM -0600, John Arbash Meinel wrote:
> Andrew Bennetts wrote:
> > Hi all,
> > 
> > I'm still after reviews for the following branches:
> > 
> >   * http://people.ubuntu.com/~andrew/bzr/chroot-transport/
> >     Adds a ChrootTransportDecorator.  This is so I can setup a HTTP smart server
> >     serving e.g. /srv/project.com/source without exposing the rest of the file
> >     system.
> 
> I gave you feedback on this one. It seems like it works, but it seems
> like it is a band-aid rather than fixing the real problem. If it is
> required to get proper server-side support without exposing the whole
> filesystem, then we can merge it.
> 
> We would want some sort of 'chroot+' anyway, so it would all be possible
> after the fact, without having to deprecate anything.

I've replied to your review of this seperately.

> >   * http://people.ubuntu.com/~andrew/bzr/http-smart-server/
> >     (depends on chroot-transport).  This is just a near-trivial bug fix to make
> >     sure that a request indicates it needs no more data once its response has
> >     been sent.  
> 
> Well, I thought we had a less-than trivial case here, because the diff
> against chroot-transport is 3.9K lines. But it turns out that is because
> you merged memory-something into it. Which has already been merged into
> bzr.dev.
> 
> I don't remember you posting this as a request before. But +1 to the
> changes.
> 
> This needs to be cleaned up to merge cleanly, because of the convoluted
> ancestry, it can't seem to find a clean base. You have chroot-transport
> in there, and then you merged the 'memory' fixes, but the second ones
> have been merged into bzr.dev, but not the first, etc, etc.

I merged the latest bzr.dev into this (and chroot-transport), so hopefully the
convulted merge history won't matter anymore -- certainly it merges cleaning
into bzr.dev for me.

The diff to examine would be the one against chroot-transport.  It's quite
short:

 tests/test_smart_transport.py |   15 ++++++---------
 transport/smart.py            |   15 ++++++---------
 2 files changed, 12 insertions(+), 18 deletions(-)

I've attached this diff for convenience.

> I did find that 'bzr remerge --format=weave' helped, though on some
> files it did worse than 'bzr remerge --reprocess'. (bzrlib/errors.py was
> one of those). Anyway, I think I was able to sort through all of it.

This branch doesn't touch errors.py that I can see!  It must have picked a
really weird base.

> >   * http://people.ubuntu.com/~andrew/bzr/smart-server-unicode/
> >     (depends on http-smart-server).  Don't always decode/encode request/response
> >     tuples as UTF-8, because sometimes we actually want the bytestrings.
> 
> This is pretty hard to read, because you merged lots of things into this
> branch. So there isn't a simple ancestor that we can use to see what the
> real changes are. I went ahead and tried to clean up a merge from
> http-smart-server, so I would be able to review it. But I didn't spend a
> lot of time making sure my merge was correct.

Again, this is a branch off the previous branch (in this case
http-smart-server).  A diff against that branch is small:

 tests/test_smart_transport.py |   34 ++++++++++++++++++++--------------
 transport/smart.py            |   23 +++++++++--------------
 2 files changed, 29 insertions(+), 28 deletions(-)

I've also attached this diff for convenience.

> After sorting through all of the diff and merge issues, it seems pretty
> straightforward. Basically you changed the smart protocol to not decode
> automatically, and move decoding up higher in the stack. The only
> request I remember asking for was that you use something that needs to
> be url escaped (so it has a '+' or a '%' or something like that). So
> that you are actually thoroughly testing the request interface. That is
> actually more important than Unicode at the moment, since we don't make
> any unicode requests (yet).
> 
> Otherwise +1 from me.

Thanks.  I'll make sure to take care of your url escaping request.

> >   * http://people.ubuntu.com/~andrew/bzr/wsgi-smart-server/
> >     (depends on smart-server-unicode).  Actually, this has mostly been reviewed
> >     once by John already.  The only changes since then have been to make use of
> >     ChrootTransportDecorator so that it's reasonably secure by default.
> > 
> > -Andrew.
> 
> This last one looks fine too. So if you can make these clean to be
> merged, I'm okay with merging it. Though I would like to make a request
> for cleaning up Transport actions, rather than the current
> implementation of ChrootTransport.

This branch should use whatever chrooting ends up in bzr.dev, whether that's my
decorator or some other solution.  From the perspective of this branch, I don't
care what the solution is, so long as it's there for me to use :)

-Andrew.

-------------- next part --------------
=== modified file 'bzrlib/tests/test_smart_transport.py'

--- bzrlib/tests/test_smart_transport.py	2006-10-21 02:33:37 +0000
+++ bzrlib/tests/test_smart_transport.py	2006-11-10 16:44:12 +0000
@@ -1110,7 +1110,7 @@
         self.assertOffsetSerialisation([(1,2), (3,4), (100, 200)],
             '1,2\n3,4\n100,200', self.client_protocol, self.smart_server_request)
 
-    def test_accept_bytes_to_protocol(self):
+    def test_accept_bytes_of_bad_request_to_protocol(self):
         out_stream = StringIO()
         protocol = smart.SmartServerRequestProtocolOne(None, out_stream.write)
         protocol.accept_bytes('abc')
@@ -1119,7 +1119,7 @@
         self.assertEqual("error\x01Generic bzr smart protocol error: bad request"
             " u'abc'\n", out_stream.getvalue())
         self.assertTrue(protocol.has_dispatched)
-        self.assertEqual(1, protocol.next_read_size())
+        self.assertEqual(0, protocol.next_read_size())
 
     def test_accept_bytes_with_invalid_utf8_to_protocol(self):
         out_stream = StringIO()
@@ -1131,7 +1131,7 @@
             "one or more arguments of request '\\xdd\\n' are not valid UTF-8\n",
             out_stream.getvalue())
         self.assertTrue(protocol.has_dispatched)
-        self.assertEqual(1, protocol.next_read_size())
+        self.assertEqual(0, protocol.next_read_size())
 
     def test_accept_body_bytes_to_protocol(self):
         protocol = self.build_protocol_waiting_for_body()
@@ -1184,13 +1184,10 @@
         self.assertEqual("hello\n", protocol.excess_buffer)
         self.assertEqual("", protocol.in_buffer)
 
-    def test_sync_with_request_sets_finished_reading(self):
-        protocol = smart.SmartServerRequestProtocolOne(None, None)
-        request = smart.SmartServerRequestHandler(None)
-        protocol.sync_with_request(request)
+    def test__send_response_sets_finished_reading(self):
+        protocol = smart.SmartServerRequestProtocolOne(None, lambda x: None)
         self.assertEqual(1, protocol.next_read_size())
-        request.finished_reading = True
-        protocol.sync_with_request(request)
+        protocol._send_response(('x',))
         self.assertEqual(0, protocol.next_read_size())
 
     def test_query_version(self):

=== modified file 'bzrlib/transport/smart.py'
--- bzrlib/transport/smart.py	2006-10-09 07:34:15 +0000
+++ bzrlib/transport/smart.py	2006-11-10 16:44:12 +0000
@@ -269,7 +269,7 @@
     def __init__(self, backing_transport, write_func):
         self._backing_transport = backing_transport
         self.excess_buffer = ''
-        self._finished_reading = False
+        self._finished = False
         self.in_buffer = ''
         self.has_dispatched = False
         self.request = None
@@ -301,16 +301,15 @@
                     self.in_buffer = ''
                     self._send_response(self.request.response.args,
                         self.request.response.body)
-                self.sync_with_request(self.request)
             except KeyboardInterrupt:
                 raise
             except Exception, exception:
                 # everything else: pass to client, flush, and quit
                 self._send_response(('error', str(exception)))
-                return None
+                return
 
         if self.has_dispatched:
-            if self._finished_reading:
+            if self._finished:
                 # nothing to do.XXX: this routine should be a single state 
                 # machine too.
                 self.excess_buffer += self.in_buffer
@@ -326,7 +325,6 @@
                 self.request.end_of_body()
                 assert self.request.finished_reading, \
                     "no more body, request not finished"
-            self.sync_with_request(self.request)
             if self.request.response is not None:
                 self._send_response(self.request.response.args,
                     self.request.response.body)
@@ -338,17 +336,16 @@
 
     def _send_response(self, args, body=None):
         """Send a smart server response down the output stream."""
+        assert not self._finished, 'response already sent'
+        self._finished = True
         self._write_func(_encode_tuple(args))
         if body is not None:
             assert isinstance(body, str), 'body must be a str'
             bytes = self._encode_bulk_data(body)
             self._write_func(bytes)
 
-    def sync_with_request(self, request):
-        self._finished_reading = request.finished_reading
-        
     def next_read_size(self):
-        if self._finished_reading:
+        if self._finished:
             return 0
         if self._body_decoder is None:
             return 1

-------------- next part --------------
=== modified file 'bzrlib/tests/test_smart_transport.py'
--- bzrlib/tests/test_smart_transport.py	2006-11-10 16:44:31 +0000
+++ bzrlib/tests/test_smart_transport.py	2006-11-10 16:52:59 +0000
@@ -596,7 +596,7 @@
         self.assertEqual('ok\0011\n',
                          from_server.getvalue())
 
-    def test_canned_get_response(self):
+    def test_response_to_canned_get(self):
         transport = memory.MemoryTransport('memory:///')
         transport.put_bytes('testfile', 'contents\nof\nfile\n')
         to_server = StringIO('get\001./testfile\n')
@@ -612,6 +612,24 @@
                          'done\n',
                          from_server.getvalue())
 
+    def test_response_to_canned_get_of_utf8(self):
+        # wire-to-wire, using the whole stack, with a UTF-8 filename.
+        transport = memory.MemoryTransport('memory:///')
+        utf8_filename = u'testfile\N{INTERROBANG}'.encode('utf-8')
+        transport.put_bytes(utf8_filename, 'contents\nof\nfile\n')
+        to_server = StringIO('get\001' + utf8_filename + '\n')
+        from_server = StringIO()
+        server = smart.SmartServerPipeStreamMedium(
+            to_server, from_server, transport)
+        protocol = smart.SmartServerRequestProtocolOne(transport,
+                from_server.write)
+        server._serve_one_request(protocol)
+        self.assertEqual('ok\n'
+                         '17\n'
+                         'contents\nof\nfile\n'
+                         'done\n',
+                         from_server.getvalue())
+
     def test_pipe_like_stream_with_bulk_data(self):
         sample_request_bytes = 'command\n9\nbulk datadone\n'
         to_server = StringIO(sample_request_bytes)
@@ -1117,19 +1135,7 @@
         self.assertEqual('abc', protocol.in_buffer)
         protocol.accept_bytes('\n')
         self.assertEqual("error\x01Generic bzr smart protocol error: bad request"
-            " u'abc'\n", out_stream.getvalue())
-        self.assertTrue(protocol.has_dispatched)
-        self.assertEqual(0, protocol.next_read_size())
-
-    def test_accept_bytes_with_invalid_utf8_to_protocol(self):
-        out_stream = StringIO()
-        protocol = smart.SmartServerRequestProtocolOne(None, out_stream.write)
-        # the byte 0xdd is not a valid UTF-8 string.
-        protocol.accept_bytes('\xdd\n')
-        self.assertEqual(
-            "error\x01Generic bzr smart protocol error: "
-            "one or more arguments of request '\\xdd\\n' are not valid UTF-8\n",
-            out_stream.getvalue())
+            " 'abc'\n", out_stream.getvalue())
         self.assertTrue(protocol.has_dispatched)
         self.assertEqual(0, protocol.next_read_size())
 

=== modified file 'bzrlib/transport/smart.py'
--- bzrlib/transport/smart.py	2006-10-13 06:07:17 +0000
+++ bzrlib/transport/smart.py	2006-11-10 16:52:59 +0000
@@ -233,17 +233,12 @@
         return None
     if req_line[-1] != '\n':
         raise errors.SmartProtocolError("request %r not terminated" % req_line)
-    try:
-        return tuple((a.decode('utf-8') for a in req_line[:-1].split('\x01')))
-    except UnicodeDecodeError:
-        raise errors.SmartProtocolError(
-            "one or more arguments of request %r are not valid UTF-8"
-            % req_line)
+    return tuple(req_line[:-1].split('\x01'))
 
 
 def _encode_tuple(args):
     """Encode the tuple args to a bytestream."""
-    return '\x01'.join((a.encode('utf-8') for a in args)) + '\n'
+    return '\x01'.join(args) + '\n'
 
 
 class SmartProtocolBase(object):
@@ -660,10 +655,8 @@
     def do_delete(self, relpath):
         self._backing_transport.delete(relpath)
 
-    def do_iter_files_recursive(self, abspath):
-        # XXX: the path handling needs some thought.
-        #relpath = self._backing_transport.relpath(abspath)
-        transport = self._backing_transport.clone(abspath)
+    def do_iter_files_recursive(self, relpath):
+        transport = self._backing_transport.clone(relpath)
         filenames = transport.iter_files_recursive()
         return SmartServerResponse(('names',) + tuple(filenames))
 
@@ -799,9 +792,11 @@
             # with a plain string
             str_or_unicode = e.object
             if isinstance(str_or_unicode, unicode):
-                val = u'u:' + str_or_unicode
+                # XXX: UTF-8 might have \x01 (our seperator byte) in it.  We
+                # should escape it somehow.
+                val = 'u:' + str_or_unicode.encode('utf-8')
             else:
-                val = u's:' + str_or_unicode.encode('base64')
+                val = 's:' + str_or_unicode.encode('base64')
             # This handles UnicodeEncodeError or UnicodeDecodeError
             return SmartServerResponse((e.__class__.__name__,
                     e.encoding, val, str(e.start), str(e.end), e.reason))
@@ -1218,7 +1213,7 @@
             end = int(resp[4])
             reason = str(resp[5]) # reason must always be a string
             if val.startswith('u:'):
-                val = val[2:]
+                val = val[2:].decode('utf-8')
             elif val.startswith('s:'):
                 val = val[2:].decode('base64')
             if what == 'UnicodeDecodeError':