[MERGE] Add some unicode-related tests from the hpss branch, and a few other nits (also from the hpss branch).

Mon Apr 16 03:43:19 BST 2007

John Arbash Meinel wrote:
> John Arbash Meinel has voted +0.
> Status is now: Semi-approved
> Comment:
> Naming the functions 'call' and 'call2' isn't very obvious. Certainly 
> from the name alone I don't know what the difference is.
> 
> Maybe "call()" and "call_with_protocol()".
> 
> Or maybe the other way around "call_no_body()" and "call()". (doesn't 
> really work because you have 'call_with_body' to indicate you are 
> sending data).

Changed the name and docstring to:

    def call_expecting_body(self, method, *args):
        """Call a method and return the result and the protocol object.

        The body can be read from the returned protocol object like so::

            result, smart_protocol = smart_client.call_expecting_body(...)
            body = smart_protocol.read_body_bytes()
        """

I think that makes the purpose and typical use clearer.

> Is "type(X) is str" a good way to do it? I've also see "x.__class__ == 
> str" and "isinstance(x, str)". (since basestring includes unicode).
> 
> I would like us to at least be consistent about it. I'm pretty sure 
> these are just meant to trap against Unicode (because non-string objects 
> would fail for other reasons)

It's much of a muchness, really.  I don't like "x.__class__ == str", because
"type(x) is str" says the same thing more clearly.  "isinstance(x, str)" is
effectively the same (str is very rarely subclassed, and people that do it
deserve whatever they get...), but I always have to hesistate momentarily to
remember which way round the arguments go.  So I tend to choose "type(x) is
str".  It's a pretty minor detail.

> You also require args to be a tuple. Do you think someone might call it 
> with a list? The requirement seems more accidental to me, but:
> 
> smart_protocol.call_with_body_bytes((method, ) + args, body)
> 
> will fail for anything but args == tuple.

I think this is a YAGNI.  It's easy to make more flexible later if we decide
it's worth it, but so far the hpss branch hasn't had trouble with using tuples.
Perhaps because the current protocol is largely about encoding and decoding
tuples?  Anyway, I'm not worried about this.

> Is this more "remote_path_for_transport"?
> +
> +    def remote_path_from_transport(self, transport):
> +        """Convert transport into a path suitable for using in a 
> request."""
> +        return unescape(urlparse(transport.base)[2]).encode('utf8')
> 
> since you are converting transport.base.
> 
> Maybe just making the docstring indicate that you are using the base 
> location of the transport.

"From" seems like the right word to me.  The situation is given a transport for
some location, what's the remote path (i.e. the path string to send to the
remote end) to use to operate on that location?  This function calculates that
remote path from the transport.  I suppose "for" would be ok too, but I don't
see how it's any better.

Although it seems almost too obvious to be worth stating, I've expanded the
docstring like so:

    def remote_path_from_transport(self, transport):
        """Convert transport into a path suitable for using in a request.

        Note that the resulting remote path doesn't encode the host name or
        anything but path, so it is only safe to use it in requests sent over
        the medium from the matching transport.
        """

> You don't have any direct tests on SmartClient. And shouldn't it be 
> called RemoteClient? (maybe I'm missing when and what things were 
> renamed)

SmartClient is reasonable name, it is a client for making smart protocol
requests with.  (Unlike "SmartTransport", SmartClient is an implementation
detail of the smart protocol code, and doesn't imply that it is an object that
is operating in a smart manner, which was we changed "SmartTransport" to
"RemoteTransport": SmartTransports aren't any more efficient, or smarter, than
e.g. SFTPTransports). 

The lack of direct tests is a shame, but (in the full hpss branch) this code is
fully tested indirectly.  SmartClient was a straightforward refactoring of
existing tested code to remove duplication.  So it does get thoroughly
exercised, and it is very simple anyhow.

I'd like there to be direct tests, but at this point it's not a big concern for
me; I'm confident in this code anyway, and SmartClient doesn't change very much
so the lack of more focussed tests isn't much of a burden to developing it.  If
SmartClient evolves more later, that would be a good time to define tests for it
I think.

> +    def test_call_with_body_bytes_unicode_args(self):
> +        self.assertCallDoesNotBreakMedium('method', (u'args',), 'body')
> 
> I think it is also reasonable to test that the second argument being 
> unicode would raise TypeError.
> +        self.assertCallDoesNotBreakMedium('method', ('a', u'arg'), 
> 'body')
> 
> I may be a little overzealous, but I think in a case where you can have 
> more than 1, you should test for more than 1. I think what I read was 
> that you should actually test for 1 and 3. (1 could easily be a special 
> case, IMO)

Well, in my view the point of testing is to give you confidence that the code
works.  The extra confidence I would gain by adding arbitrary cases like that is
minimal here, because it seems to me the code would have to be intentionally
perverse to fail like that.

Still, it's just one line, so I've added:

        self.assertCallDoesNotBreakMedium('method', ('arg1', u'arg2'), 'body')

> Why not actually raise a specific error here:
> +    def test_put_bytes_unicode(self):
> +        # Expect put_bytes to raise AssertionError or 
> UnicodeEncodeError if
> +        # given unicode "bytes".  UnicodeEncodeError doesn't really 
> make sense
> 
> 
> It also gives you a proper error to raise from MemoryError.
> 
> Why not raise "TypeError" instead?

As the comment goes on to say:

        #                                   ... but we allow it for backwards
        # compatibility.  At some point we should use a specific exception.

I agree a specific exception would be better, but because fixing that wasn't the
point of this change, I was being conservative.  I'm just making the existing
behaviour consistent (even though it's not ideal); making it better can happen
separately.

I've filed ://bugs.launchpad.net/bzr/+bug/106898, and linked to it from this
comment.

> Also, I would make sure that these tests pass under "python -O ./bzr 
> selftest". I generally frown on seeing a test expecting AssertionError 
> to be raised. Certainly the underlying code could use "if not foo: raise 
> AssertionError" rather than "assert foo". But it isn't a good exception 
> to be using, especially when you codify that is should be used by the 
> test suite.

Yep, it passes.  Worth checking though! :)

Any chance of an upgrade to +1?

-Andrew.