ssh authorized_keys and known_hosts

Dustin Kirkland kirkland at canonical.com
Tue Oct 18 15:23:16 UTC 2011


On Tue, Oct 18, 2011 at 4:26 AM, William Reade
<william.reade at canonical.com> wrote:
> Hi all
>
> I've been looking at https://bugs.launchpad.net/bugs/802117 (ensemble
> ssh command should use a different known_hosts file), and I have a
> (multipart) suggestion (which should maybe be broken into separate bugs,
> assuming anyone actually likes the ideas herein):

Awesome, thanks for looking at this.  I just added a comment to the
bug, but I have some working functionality in shell script form in
lp:bikeshed, in the 'cloud-sandbox' script.  It generates two sets of
host ssh keys, injects the first one into the instance via metadata
and adds its fingerprint to a separate local known_hosts.cloud file,
starts the instance, replaces the first host ssh key with a second
one transmitted over ssh (since metadata is visible to all local
users of the system), and prunes the fingerprints when done.

Have a look at the implementation here:
 * http://bazaar.launchpad.net/~bikeshed/bikeshed/trunk/view/head:/cloud-sandbox
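
A rough Python sketch of that generation step (the real thing is
shell in cloud-sandbox; the key type, paths, and known_hosts.cloud
location below are illustrative assumptions, not what the script
literally does):

    # Generate two host key pairs locally, and trust only the first one
    # (the one that will be injected via metadata) in a separate
    # known_hosts.cloud file, so the very first ssh connection can be
    # strictly checked.
    import os
    import subprocess
    import sys

    def generate_keypair(path):
        # -N '' = no passphrase, -q = quiet; RSA is just an example type
        subprocess.check_call(
            ["ssh-keygen", "-q", "-t", "rsa", "-N", "", "-f", path])

    def known_hosts_entry(host, pubkey_path):
        # A known_hosts line is "<host> <key-type> <base64-key>"
        keytype, key = open(pubkey_path).read().split()[:2]
        return "%s %s %s\n" % (host, keytype, key)

    if __name__ == "__main__":
        host = sys.argv[1]                     # instance hostname or IP
        keydir = os.path.expanduser("~/.cloud-keys/%s" % host)
        if not os.path.isdir(keydir):
            os.makedirs(keydir)
        for name in ("hostkey1", "hostkey2"):  # two keys, as described
            generate_keypair(os.path.join(keydir, name))
        with open(os.path.expanduser("~/.ssh/known_hosts.cloud"), "a") as kh:
            kh.write(known_hosts_entry(
                host, os.path.join(keydir, "hostkey1.pub")))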

> 1) Juju should generate and store an environment-specific SSH key when
> it bootstraps, and should always authorise that key, and use it whenever
> it connects (either when tunnelling to ZK, or when just 'juju ssh'ing).

+1, and those should be unique per machine, too.  Having discussed
this with Kees, Jamie, and Marc on the Security team, I think it best
to do the two-key generation described above.  Locally and securely
generate two host key pairs, and record both public keys in a
separate known_hosts.juju file.  Install the first key on the system
via metadata/cloud-init.  Replace that key as soon as possible with
the second key, transmitted over ssh.  Oh, and prune the fingerprints
when you're done with them (i.e., when you destroy the environment).
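
Roughly, the replacement step looks like this in Python (a sketch
only; the remote paths, the ubuntu user, and the key filenames are
assumptions):

    # Once the instance is reachable over ssh (strictly checked against
    # the first, metadata-injected key), push the second, locally
    # generated key pair, make it the host key, restart sshd, and swap
    # the entry in known_hosts.juju.
    import subprocess

    def ssh_opts(identity, known_hosts):
        return ["-i", identity,
                "-o", "UserKnownHostsFile=%s" % known_hosts,
                "-o", "StrictHostKeyChecking=yes"]

    def replace_host_key(host, identity, known_hosts, keydir):
        opts = ssh_opts(identity, known_hosts)
        # Copy the second key pair onto the instance (never via metadata).
        subprocess.check_call(
            ["scp"] + opts + ["%s/hostkey2" % keydir,
                              "%s/hostkey2.pub" % keydir,
                              "ubuntu@%s:/tmp/" % host])
        # Install it as the host key and restart sshd.
        subprocess.check_call(
            ["ssh"] + opts + ["ubuntu@%s" % host,
             "sudo install -m 600 /tmp/hostkey2 /etc/ssh/ssh_host_rsa_key && "
             "sudo install -m 644 /tmp/hostkey2.pub /etc/ssh/ssh_host_rsa_key.pub && "
             "sudo rm -f /tmp/hostkey2 /tmp/hostkey2.pub && "
             "sudo service ssh restart"])
        # Replace the old entry in known_hosts.juju with the second key.
        subprocess.call(["ssh-keygen", "-R", host, "-f", known_hosts])
        keytype, key = open("%s/hostkey2.pub" % keydir).read().split()[:2]
        with open(known_hosts, "a") as kh:
            kh.write("%s %s %s\n" % (host, keytype, key))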

In this way, we could *really* improve the security and user
experience of Juju by removing lots and lots of blind fingerprint
acceptance.

> 2) (most relevant to this bug as stated) machine agents should publish
> their machine's public key to ZK, and any juju-mediated SSHing should
> use: the generated IdentityFile; a temporary UserKnownHostsFile,
> generated on demand, containing just the required host; and
> StrictHostKeyChecking=yes.

Okay, seems reasonable.
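
Concretely, that connection might be assembled along these lines (a
sketch, not the real Juju code; the host key string is whatever the
machine agent published):

    # Connect using the environment's generated identity, a throwaway
    # known_hosts file containing only this machine's published host
    # key, and strict host key checking.
    import os
    import subprocess
    import tempfile

    def strict_ssh(host, identity, host_public_key):
        # host_public_key is "<key-type> <base64-key>", as published to ZK.
        fd, known_hosts = tempfile.mkstemp(prefix="juju-known-hosts-")
        try:
            with os.fdopen(fd, "w") as f:
                f.write("%s %s\n" % (host, host_public_key))
            subprocess.check_call(
                ["ssh",
                 "-i", identity,
                 "-o", "UserKnownHostsFile=%s" % known_hosts,
                 "-o", "StrictHostKeyChecking=yes",
                 "ubuntu@%s" % host])
        finally:
            os.unlink(known_hosts)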

> 3) Now, the solution described so far won't work for the bootstrap node
> (you can't get anything out of ZK without an SSH key in the first
> place). So, the *provisioning* agent should publish each machine's
> public key to the provider file storage as soon as it becomes available
> (and delete it when the machine goes away), and anyone wanting to
> connect to a machine should get the public key from file storage rather
> than zookeeper.

Hmm, see my implementation in bikeshed/cloud-sandbox.  I'd be
delighted to pull the key-generation code out into a simple, secure
shell utility or Python library, if you were to use it.  I'm guessing
Juju will want to rewrite it from scratch, though.
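
At any rate, the file-storage side of what you describe is small;
something like this ought to do (a sketch: "storage" is a
hypothetical object with get()/put() methods standing in for the
provider's file storage, and the key path is made up):

    def publish_host_key(storage, machine_id, public_key):
        # Called by the provisioning agent as soon as the key is known.
        storage.put("host-keys/machine-%s.pub" % machine_id, public_key)

    def fetch_host_key(storage, machine_id):
        # Called by clients before connecting; no ZK access required.
        data = storage.get("host-keys/machine-%s.pub" % machine_id)
        return data.strip() if data else None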

> Benefits:
>
> * Nobody has to type "yes".

+100.

> * Nobody hits the "no SSH keys" bootstrap error we occasionally hear of,
> and nobody gets put off by having to generate an SSH key for themselves
> before they can try juju. (OK, it's not a lot of work, but I suspect
> every extra step before you see something working will lose us a certain
> proportion of potential users.)

+1.

> * The initial "juju status" commands (when waiting for bootstrap) can be
> much friendlier: we check for the node's public key in FileStorage, and
> if it's not there we can *quickly* say "Environment not ready: waiting
> for public key from machine/0"; however, once it *is* there, we should
> get quick connections (no waiting for initialisation, because ZK must
> already be up and running for the key to have got from MA to PA before
> being written to FS).

+1.
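
In code terms, that check is roughly (using the same hypothetical
"storage" object as in the sketch above):

    def check_bootstrap_ready(storage):
        # Quick readiness check for "juju status" while bootstrapping.
        key = storage.get("host-keys/machine-0.pub")
        if key is None:
            print("Environment not ready: "
                  "waiting for public key from machine/0")
            return False
        # If the key is present, ZK must already have been up for it to
        # travel from the machine agent to the provisioning agent, so
        # the actual connection should be quick.
        return True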

> * more..?

- A less cluttered $HOME/.ssh/known_hosts file.  Mine has grown to
thousands of entries, having run thousands of ephemeral instances in
EC2, OpenStack, Eucalyptus, Juju, etc.  Most of these are completely
worthless, as the instances are long gone.  Put the Juju ones in a
file of their own, and prune the entries when Juju is done with them
(a rough pruning sketch follows after the next point).

- This is a more secure solution for several reasons.  For one thing,
the fingerprints are actually *validated*, rather than you just
typing 'yes' (which, unless you check ec2-get-console-output, you're
doing blindly).  Moreover, your local physical machine has more
entropy available, so keys you generate there are more secure than
ones generated by a cloud instance (which is nearly identical to a
million other cloud instances).
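
The pruning mentioned above can be as simple as running ssh-keygen -R
against the Juju-specific file for each machine at environment-
destroy time; a sketch (the file path and hostname list are assumed):

    import os
    import subprocess

    def prune_known_hosts(hosts, known_hosts="~/.ssh/known_hosts.juju"):
        known_hosts = os.path.expanduser(known_hosts)
        for host in hosts:
            # ssh-keygen -R removes every key belonging to host from the
            # file; ignore the return code for hosts already gone.
            subprocess.call(["ssh-keygen", "-R", host, "-f", known_hosts])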

> Drawbacks:
>
> * No mention is made of what we should do if we want to connect with
> additional keys (but really, why would we want to if we can already
> 'juju ssh' to any machine?). This is a pre-existing problem, anyway, and
> we'll need to do some work on it regardless (if it is an important use
> case).

Please, please, please, please use ssh-import-id here!  Allow the
user to set an environment variable, SSH_IMPORT_ID, for instance.  If
that variable is set, then add a metadata/cloud-init stanza that also
securely runs ssh-import-id on that whitespace-separated list of IDs.

Again, cloud-sandbox in lp:bikeshed does this.  I think it actually
looks for the environment variable LAUNCHPAD_ID.  Feel free to
bikeshed over the name of that variable, and just tell us what you
decide on ;-)
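
To illustrate, deriving the cloud-init stanza from that variable
could look like this (a sketch; SSH_IMPORT_ID is just the name
suggested above, and ssh_import_id is cloud-init's standard config
key for this module):

    # Build a #cloud-config user-data snippet that ssh-import-id's the
    # whitespace-separated IDs in $SSH_IMPORT_ID.
    import os
    import yaml

    def ssh_import_id_stanza():
        ids = os.environ.get("SSH_IMPORT_ID", "").split()
        # cloud-init's ssh-import-id module takes a list of Launchpad IDs.
        return {"ssh_import_id": ids} if ids else {}

    def user_data(extra_config=None):
        config = dict(extra_config or {})
        config.update(ssh_import_id_stanza())
        return "#cloud-config\n" + yaml.safe_dump(config)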

> * Sharing the juju admin key between different clients will be a hassle,
> but no more so than sharing environment config already is. (I guess this
> is maybe a reason to authorise multiple keys; but again, this relates to
> the existing problem, and we probably want something like "juju access
> grant|revoke /some/random/id_rsa.pub" command[0].

Heh, sounds like a traditional configuration management kind of
problem, no?  :-)

> * more..?
>
> Does anyone have any objections to this, before I get too deeply into
> this? Tentatively, it feels like it could be broken down into the
> following bugs:

Thanks for tackling this, William.  Please give a little thought to
the code and feedback I've shared above.  I think most of this should
be pretty straightforward to solve, and it should make for a much
better Juju user experience.

> * machines' public keys should be published to filestorage by the
> provisioning agent; and juju should specify the host's public key,
> acquired therefrom, before attempting to connect via ssh.
> * juju should generate (and authorise) its own ssh key at bootstrap
> time, and always use that to connect to machines.
> * add "juju access" subcommand to manage authorised keys, and necessary
> infrastructure to keep them up to date on machines.
>
> ...each of which is reasonably independent, and gives some sort of value
> on its own. Thoughts?
>
> Cheers
> William
>
> [0] I suppose "juju access revoke ~/.juju/myenv/id_rsa.pub" could be a
> problem... details, details.
>



-- 
:-Dustin

Dustin Kirkland
Manager, Systems Integration
Corporate Services
Canonical, LTD


