[MERGE] Support for Putty SSH implementation

Mon Jan 8 02:01:57 GMT 2007

Dmitry Vasiliev wrote:
> 
> The attached patch adds support for Putty's plink SSH implementation. 
> Also slightly refactored SSH negotiation code and added some tests.

I like the look of this.  Some comments, mostly about making the intent of some
code clearer:

bzrlib/tests/test_ssh_transport.py
----------------------------------

[...]
> +
> +    def test_default_vendor(self):
> +        manager = TestSSHVendorManager()
> +        self.assertRaises(SSHVendorNotFound, manager.get_vendor, {})
> +        manager.register_vendor("default", "VENDOR")
> +        self.assertEqual(manager.get_vendor({}), "VENDOR")

An explicit comment or docstring in the test stating that "default" as a vendor
name is treated specially would be good.  It's not hard to guess, but making the
intent explicit is good, in case someone guesses incorrectly :)

> +    def test_cached_vendor(self):
> +        manager = TestSSHVendorManager()
> +        self.assertRaises(SSHVendorNotFound, manager.get_vendor, {})
> +        manager.register_vendor("vendor", "VENDOR")
> +        self.assertRaises(SSHVendorNotFound, manager.get_vendor, {})
> +        self.assertEqual(manager.get_vendor({"BZR_SSH": "vendor"}), "VENDOR")
> +        self.assertEqual(manager.get_vendor({}), "VENDOR")

The intented behaviour here isn't immediately obvious to me.  Why should
manager.get_vendor({"BZR_SSH": "vendor"}) change the result of a later
manager.get_vendor({}) call?

Basically, I'd like to see a comment (or docstring) in this test explain the
intent.  Something like "Once SSHVendorManager.get_vendor successfully returns
a vendor, later invocations will return the same vendor, regardless of changes
to the environment.", but that doesn't sound quite right to me.  How you would
explain this behaviour, and the reason for it?

I'd also be tempted to use something other than "VENDOR" here, e.g. "vendor =
object()", so that you can add "self.assertIdentical(vendor,
manager.get_vendor(...))", i.e. demonstrate that it's the exact same object.
(You could technically do it with the string object, but string identity is
mildly confusing due to interning, so using an object would be better.)

> +    def test_vendor_getting_methods_precedence(self):
> +        manager = TestSSHVendorManager()
> +        self.assertRaises(SSHVendorNotFound, manager.get_vendor, {})
> +
> +        manager.register_vendor("default", "DEFAULT")
> +        self.assertEqual(manager.get_vendor({}), "DEFAULT")
> +
> +        manager.ssh_vendor = None
> +        manager.set_ssh_version_string("OpenSSH")
> +        self.assertIsInstance(manager.get_vendor({}), OpenSSHSubprocessVendor)
> +
> +        manager.ssh_vendor = None
> +        manager.register_vendor("vendor", "VENDOR")
> +        self.assertEqual(manager.get_vendor({"BZR_SSH": "vendor"}), "VENDOR")
> +
> +        self.assertEqual(manager.get_vendor({}), "VENDOR")

This test could use some comments as well.  What's the "precendence" you're
testing?  Peeking at the implementation, I guess this is testing the order that
get_vendor attempts to choose a vendor, i.e. by environment variable, by
inspection, by default, and finally by giving up.  Say so explicitly in the test
(including naming the stages in the test)!

Also, the purpose of the "manager.ssh_vendor = None" lines probably deserves a
comment explaining that this is clearing the cache.  Perhaps it would be clearer
to create a fresh TestSSHVendorManager each time, although you'd have to repeat
some registrations, so perhaps not.

I'm mildly surprised that "ssh_vendor" isn't "_ssh_vendor", it doesn't look like
something that most code ought to be mucking with.

NEWS
----

>    IMPROVEMENTS:
>  
> +    * Added support for Putty's SSH implementation. (Dmitry Vasiliev)
> +

You're being overly modest.  You've done a nice cleanup of SSH vendor
registration too! :)   (although mentioning that probably belongs under the
INTERNALS heading).

bzrlib/errors.py
----------------

[...]
> +class SSHVendorNotFound(BzrError):
> +
> +    _fmt = "Don't know how to handle SSH connections. Please set BZR_SSH environment variable."

Lines should have no more than 79 characters (PEP 8).

bzrlib/transport/ssh.py
------------------------

[...]
>                             SocketConnectionError,
>                             TransportError,
>                             UnknownSSH,
> +                           SSHVendorNotFound,
>                             )

I don't think this convention is documented anywhere (I don't see it in HACKING
or PEP 8), but the other names being imported from bzrlib.errors there are in
alphabetical order; I think it would be good to keep that ordering.

>  from bzrlib.osutils import pathjoin
> @@ -58,55 +59,79 @@
>  # connect to an agent if we are on win32 and using Paramiko older than 1.6
>  _use_ssh_agent = (sys.platform != 'win32' or _paramiko_version >= (1, 6, 0))
>  
> -_ssh_vendors = {}
> -
> -def register_ssh_vendor(name, vendor):
> -    """Register SSH vendor."""
> -    _ssh_vendors[name] = vendor
> -
> -    
> -_ssh_vendor = None
> -def _get_ssh_vendor():
> -    """Find out what version of SSH is on the system."""
> -    global _ssh_vendor
> -    if _ssh_vendor is not None:
> -        return _ssh_vendor
> -
> -    if 'BZR_SSH' in os.environ:
> -        vendor_name = os.environ['BZR_SSH']
> +
> +class SSHVendorManager(object):
> +    """Manager for manage SSH vendors."""

Hmm, it's tempting to making this a bzrlib.registry.Registry, until you realise
that the interface doesn't quite fit; instead of "get(key)" this essentially
just has "get()", which chooses the best registered vendor for the current
environment (I mean environment in the broader sense, not just the "os.environ"
sense).

It may be worth adding a brief comment highlighting this fundamental difference,
to stop people (like me) from chasing foolish consistency by trying reuse
Registry here :)

[...]
> +    def _get_vendor_by_environment(self, environment=None):
> +        if environment is None:
> +            environment = os.environ
> +        if 'BZR_SSH' in environment:
> +            vendor_name = environment['BZR_SSH']
> +            try:
> +                vendor = self._ssh_vendors[vendor_name]
> +            except KeyError:
> +                raise UnknownSSH(vendor_name)
> +            return vendor
> +        return None

A comment in English explaining the intent would be good.  Even though it's not
hard to reverse-engineer the intent from the current implementation, code
evolves, and intent that seemed clear will get obscured unless it's stated
explicitly.

So, add a comment saying "If the user has set the BZR_SSH environment variable,
use that to choose the vendor." 

> +    def _get_vendor_by_version_string(self, version):
> +        vendor = None
> +        if 'OpenSSH' in version:
> +            mutter('ssh implementation is OpenSSH')
> +            vendor = OpenSSHSubprocessVendor()
> +        elif 'SSH Secure Shell' in version:
> +            mutter('ssh implementation is SSH Corp.')
> +            vendor = SSHCorpSubprocessVendor()
> +        elif 'plink' in version:
> +            mutter("ssh implementation is Putty's plink.")
> +            vendor = PLinkSubprocessVendor()
> +        return vendor

A docstring mentioning that "version" is expected to be the output of executing
a command like 'ssh -V' would be good.

> +    def get_vendor(self, environment=None):
> +        """Find out what version of SSH is on the system."""
> +        if self.ssh_vendor is None:
> +            vendor = self._get_vendor_by_environment(environment)
> +            if vendor is None:
> +                vendor = self._get_vendor_by_inspection()
> +                if vendor is None:
> +                    mutter('falling back to default implementation')
> +                    vendor = self._ssh_vendors.get('default', None)
> +                    if vendor is None:
> +                        raise SSHVendorNotFound()
> +            self.ssh_vendor = vendor
> +        return self.ssh_vendor

The docstring for get_vendor ought to mention that it can raise
SSHVendorNotFound or UnknownSSH.

> +class PLinkSubprocessVendor(SubprocessVendor):
> +    """SSH vendor that uses the 'plink' executable from Putty."""
> +
> +    def _get_vendor_specific_argv(self, username, host, port, subsystem=None,
> +                                  command=None):
> +        assert subsystem is not None or command is not None, (
> +            'Must specify a command or subsystem')
> +        if subsystem is not None:
> +            assert command is None, (
> +                'subsystem and command are mutually exclusive')

Hmm, those asserts probably belong in the base class, rather than repeated.
That's not a new problem in your branch, though.

> +        args = ['plink', '-x', '-a', '-ssh', '-2']
> +        if port is not None:
> +            args.extend(['-P', str(port)])
> +        if username is not None:
> +            args.extend(['-l', username])
> +        if subsystem is not None:
> +            args.extend(['-s', subsystem, host])
> +        else:
> +            args.extend([host] + command)
> +        return args
> +
> +register_ssh_vendor('plink', PLinkSubprocessVendor())

It's good to see that the work to actually define and register a new vendor is
pretty minimal.

Overall, I'd be happy to see this merged.  +1 with my comments taken in account.

-Andrew.