[Bug 559230] Re: multi-machine topology, cannot reach an instance from the CLC
C de-Avillez
hggdh2 at gmail.com
Wed Apr 14 01:48:28 BST 2010
I tested it again this evening, with Dustin monitoring. We again used
lucid-amd64-topo2, and based the installs on the daily server/UEC images
(releases.ubuntu.com is not accessible from tamarind, so I could not use
Beta2).
Installation was uneventful.
I then ran the config_single.yaml test. No problems starting instances,
but still the script (or even I, manually) could not ssh into them,
failing with a timeout.
I ran, just for the sake of it (I do not know what is, or is not, blocked
by the firewall(s)), a traceroute against one of the instances, from
cempedak. It reached marula (the CC), and every hop after that showed
only asterisks.
I then logged in to marula, and ssh-ed to an instance I had manually
started. I *could* reach it (but failed, correctly, on public key -- I
had not added a new key for this run, and the ones used by uec_test.py
had already been revoked).
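The reachability check used here (does a TCP connection to the instance's
port 22 succeed, get refused, or time out?) can be sketched in Python. This
is only an illustration and not part of uec_test.py; the function name
probe_tcp and the 5-second default timeout are assumptions made for the
example.

```python
import socket


def probe_tcp(host, port=22, timeout=5.0):
    """Try one TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <reason>".
    probe_tcp is a hypothetical helper, not an existing UEC tool.
    """
    try:
        # create_connection resolves the host and applies the timeout
        # to the connect attempt itself.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all: packets are probably dropped or not routed.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing listens on that port.
        return "refused"
    except OSError as exc:
        # Anything else, e.g. "no route to host" or a resolver failure.
        return "error: %s" % exc
```

Running probe_tcp against the same instance address from cempedak and then
from marula separates the two cases: "timeout" from the former together with
"open" from the latter would match what is described above, and points at
routing rather than at sshd or the key.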
This is the log of the IRC chat between Dustin and myself:
2010-04-13 18:25:32 hggdh kirkland: nodes registered, running a single-instance test now
2010-04-13 18:33:02 hggdh kirkland: test running, log is being written to ~/uec-testing-scripts/resutls/single*
2010-04-13 18:33:09 hggdh kirkland: on cempedak
2010-04-13 18:33:20 kirkland hggdh: cool, and you can ssh in?
2010-04-13 18:35:08 hggdh kirkland: negative
2010-04-13 18:35:19 kirkland hggdh: cannot ssh in
2010-04-13 18:35:25 hggdh kirkland: ssh fails on timeout
2010-04-13 18:35:31 hggdh really sounds like routing
2010-04-13 18:36:18 kirkland hggdh: interesting
2010-04-13 18:36:25 kirkland hggdh: okay, put the log somewhere for me to check out
2010-04-13 18:38:27 hggdh kirkland: k. I just ran one instance by hand, and then tried to ssh into it -- fails with a timeout
2010-04-13 18:39:25 kirkland hggdh: okay, that's easy to reproduce
2010-04-13 18:39:27 kirkland hggdh: log?
2010-04-13 18:42:29 hggdh kirkland: people.c.c/~cerdea/single_test.log.2010-04-13_193218
2010-04-13 18:46:15 kirkland hggdh: rsync -aP people.canonical.com:~cerdea/single_test.log.2010-04-13_193218 .
2010-04-13 18:46:20 kirkland hggdh: file not found
2010-04-13 18:47:04 kirkland hggdh: found it, public_html
2010-04-13 18:47:27 hggdh heh. one wants it on public_html, another on the root ;-)
2010-04-13 18:49:35 kirkland hggdh: ls -alF users/admin/uectest-k0.priv
2010-04-13 18:50:07 kirkland hggdh: and cat that file, make sure it matches -----BEGIN RSA PRIVATE KEY-----
2010-04-13 18:50:33 kirkland hggdh: is that instance still running?
2010-04-13 18:50:43 kirkland hggdh: can you telnet to its port 22 ?
2010-04-13 18:51:03 hggdh kirkland: yes, the instance is still running
2010-04-13 18:52:00 hggdh kirkland: the priv key seems kosher
2010-04-13 18:52:27 kirkland hggdh: and telnet ?
2010-04-13 18:53:50 hggdh kirkland: timeout. Also, a traceroute (FWIW) reaches marula (the CC) and stops there
2010-04-13 18:54:07 kirkland hggdh: oh, interesting
2010-04-13 18:54:22 kirkland hggdh: that's got to be it
2010-04-13 18:54:25 hggdh kirkland: let me try to ssh from marula
2010-04-13 18:54:38 kirkland hggdh: yeah
2010-04-13 18:54:43 kirkland hggdh: scp the priv key over
2010-04-13 18:54:47 kirkland hggdh: and try from there
2010-04-13 18:55:15 hggdh kirkland: first test -- reachability -- successful
2010-04-13 18:55:21 hggdh will move the priv key there now
2010-04-13 18:55:21 kirkland hggdh: ack
2010-04-13 19:00:03 kirkland hggdh: and?
2010-04-13 19:00:13 hggdh kirkland: getting permission denied (pub key)
2010-04-13 19:00:30 hggdh kirkland: but the important piece is that I am *reaching* the instance
2010-04-13 19:00:34 kirkland hggdh: hrm, odd
2010-04-13 19:00:38 kirkland hggdh: agreed on that point
2010-04-13 19:00:49 kirkland hggdh: and you're doing ssh -i ./whatever.priv ubuntu@ip ?
2010-04-13 19:00:58 kirkland hggdh: and whatever.priv is perm'd 600
2010-04-13 19:01:17 hggdh kirkland: yes indeed, and will check again
2010-04-13 19:01:26 hggdh but on wrong permission ssh would bail out
2010-04-13 19:03:41 hggdh kirkland: and the full command is ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i ./uectest-k0.priv ubuntu@10.55.55.100
2010-04-13 19:04:07 hggdh although sort of overworked, I admit
2010-04-13 19:04:24 kirkland hggdh: hmm, okay
2010-04-13 19:04:35 kirkland hggdh: it may be that the guest is having trouble getting out
2010-04-13 19:04:48 kirkland hggdh: or at least to have the key injected
2010-04-13 19:04:58 kirkland hggdh: okay, add your traceroute findings to that bug
2010-04-13 19:05:11 kirkland hggdh: and email mathias (cc me) the link to that log
2010-04-13 19:05:33 kirkland hggdh: i'm reassured that this appears to be a networking issue, but we'll need to get to the bottom of it
2010-04-13 19:05:38 kirkland hggdh: i gotta run for the night
2010-04-13 19:05:41 kirkland hggdh: thanks dude!
2010-04-13 19:05:55 hggdh kirkland: will do, and g'night
--
multi-machine topology, cannot reach an instance from the CLC
https://bugs.launchpad.net/bugs/559230
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to eucalyptus in ubuntu.