[Bug 559230] Re: multi-machine topology, cannot reach an instance from the CLC

C de-Avillez hggdh2 at gmail.com
Wed Apr 14 01:48:28 BST 2010


I tested it again this evening, with Dustin monitoring. We again used
lucid-amd64-topo2, and based the installs on the daily server/UEC images
(releases.ubuntu.com is not accessible from tamarind, so I could not use
Beta2).

Installation was uneventful.

I then ran the config_single.yaml test. No problems starting instances,
but still the script (or even I, manually) could not ssh into them,
failing with a timeout.

Just for the sake of it (I do not know what is, or is not, blocked by
the firewall(s)), I ran a traceroute against one of the instances, from
cempedak. It reached marula (the CC), and every hop after that showed
only asterisks (no reply).

I then logged in to marula, and ssh-ed to an instance I had manually
started. I *could* reach it (but failed, correctly, on public key -- I
had not added a new key for this run, and the ones used by uec_test.py
had already been revoked).
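
For anyone repeating the reachability check, a minimal sketch of the
"can I reach port 22?" test follows. It is not part of uec_test.py; the
function name probe_tcp and its return strings are made up for
illustration. It only separates a timeout (packets dropped somewhere on
the path, which is what I see from cempedak) from a refused connection
or an open port.

```python
import socket


def probe_tcp(host, port=22, timeout=5.0):
    """Classify one TCP connection attempt.

    Hypothetical helper for illustration only. Returns "open",
    "timeout", "refused", or "error: <reason>".
    """
    try:
        # create_connection resolves the host and applies the timeout
        # to the connect() call itself.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all: typical of a routing or firewall drop.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing listens on that port.
        return "refused"
    except OSError as exc:
        # Anything else, e.g. "No route to host".
        return "error: %s" % exc
```

Calling probe_tcp("10.55.55.100") once from cempedak and once from
marula should show the same split as the manual ssh attempts, assuming
the instance still has that address.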

This is the log of the IRC chat between Dustin and myself:

2010-04-13 18:25:32	hggdh	kirkland: nodes registered, running a single-instance test now
2010-04-13 18:33:02	hggdh	kirkland: test running, log is being written to ~/uec-testing-scripts/resutls/single*
2010-04-13 18:33:09	hggdh	kirkland: on cempedak
2010-04-13 18:33:20	kirkland	hggdh: cool, and you can ssh in?
2010-04-13 18:35:08	hggdh	kirkland: negative
2010-04-13 18:35:19	kirkland	hggdh: cannot ssh in
2010-04-13 18:35:25	hggdh	kirkland: ssh fails on timeout
2010-04-13 18:35:31	hggdh	really sounds like routing
2010-04-13 18:36:18	kirkland	hggdh: interesting
2010-04-13 18:36:25	kirkland	hggdh: okay, put the log somewhere for me to check out
2010-04-13 18:38:27	hggdh	kirkland: k. I just ran one instance by hand, and then tried to ssh into it -- fails with a timeout
2010-04-13 18:39:25	kirkland	hggdh: okay, that's easy to reproduce
2010-04-13 18:39:27	kirkland	hggdh: log?
2010-04-13 18:42:29	hggdh	kirkland: people.c.c/~cerdea/single_test.log.2010-04-13_193218
2010-04-13 18:46:15	kirkland	hggdh: rsync -aP people.canonical.com:~cerdea/single_test.log.2010-04-13_193218 .
2010-04-13 18:46:20	kirkland	hggdh: file not found
2010-04-13 18:47:04	kirkland	hggdh: found it, public_html
2010-04-13 18:47:27	hggdh	heh. one wants it on public_html, another on the root ;-)
2010-04-13 18:49:35	kirkland	hggdh: ls -alF users/admin/uectest-k0.priv
2010-04-13 18:50:07	kirkland	hggdh: and cat that file, make sure it matches -----BEGIN RSA PRIVATE KEY-----
2010-04-13 18:50:33	kirkland	hggdh: is that instance still running?
2010-04-13 18:50:43	kirkland	hggdh: can you telnet to its port 22 ?
2010-04-13 18:51:03	hggdh	kirkland: yes, the instance is still running
2010-04-13 18:52:00	hggdh	kirkland: the priv key seems kosher
2010-04-13 18:52:27	kirkland	hggdh: and telnet ?
2010-04-13 18:53:50	hggdh	kirkland: timeout. Also, a traceroute (FWIW) reaches marula (the CC) and stops there
2010-04-13 18:54:07	kirkland	hggdh: oh, interesting
2010-04-13 18:54:22	kirkland	hggdh: that's got to be it
2010-04-13 18:54:25	hggdh	kirkland: let me try to ssh from marula
2010-04-13 18:54:38	kirkland	hggdh: yeah
2010-04-13 18:54:43	kirkland	hggdh: scp the priv key over
2010-04-13 18:54:47	kirkland	hggdh: and try from there
2010-04-13 18:55:15	hggdh	kirkland: first test -- reachability -- successful
2010-04-13 18:55:21	hggdh	will move the priv key there now
2010-04-13 18:55:21	kirkland	hggdh: ack
2010-04-13 19:00:03	kirkland	hggdh: and?
2010-04-13 19:00:13	hggdh	kirkland: getting permission denied (pub key)
2010-04-13 19:00:30	hggdh	kirkland: but the important piece is that I am *reaching* the instance
2010-04-13 19:00:34	kirkland	hggdh: hrm, odd
2010-04-13 19:00:38	kirkland	hggdh: agreed on that point
2010-04-13 19:00:49	kirkland	hggdh: and you're doing ssh -i ./whatever.priv ubuntu@ip ?
2010-04-13 19:00:58	kirkland	hggdh: and whatever.priv is perm'd 600
2010-04-13 19:01:17	hggdh	kirkland: yes indeed, and will check again
2010-04-13 19:01:26	hggdh	but on wrong permission ssh would bail out
2010-04-13 19:03:41	hggdh	kirkland: and the full command is ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i ./uectest-k0.priv ubuntu@10.55.55.100
2010-04-13 19:04:07	hggdh	although sort of overworked, I admit
2010-04-13 19:04:24	kirkland	hggdh: hmm, okay
2010-04-13 19:04:35	kirkland	hggdh: it may be that the guest is having trouble getting out
2010-04-13 19:04:48	kirkland	hggdh: or at least to have the key injected
2010-04-13 19:04:58	kirkland	hggdh: okay, add your traceroute findings to that bug
2010-04-13 19:05:11	kirkland	hggdh: and email mathias (cc me) the link to that log
2010-04-13 19:05:33	kirkland	hggdh: i'm reassured that this appears to be a networking issue, but we'll need to get to the bottom of it
2010-04-13 19:05:38	kirkland	hggdh: i gotta run for the night
2010-04-13 19:05:41	kirkland	hggdh: thanks dude!
2010-04-13 19:05:55	hggdh	kirkland: will do, and g'night
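
The two key checks Dustin asked for above (the file starts with the RSA
private key header, and it is perm'd 600) can be scripted. The sketch
below is only an illustration: check_private_key is a hypothetical name,
and the rule it applies (no group or other access bits) is my reading of
what ssh accepts, not something taken from uec_test.py.

```python
import os
import stat

RSA_HEADER = "-----BEGIN RSA PRIVATE KEY-----"


def check_private_key(path):
    """Return a list of problems found with a private key file.

    Hypothetical helper for illustration only. An empty list means
    both checks passed.
    """
    problems = []

    # Permission bits only (strip the file-type bits).
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        problems.append(
            "mode is %o; group/other access should be removed (e.g. 600)" % mode
        )

    # The first line should be the PEM header quoted in the chat.
    with open(path, "r", errors="replace") as handle:
        first_line = handle.readline().strip()
    if first_line != RSA_HEADER:
        problems.append("unexpected first line: %r" % first_line)

    return problems
```

For this run it would be called as
check_private_key("users/admin/uectest-k0.priv"). A clean result only
rules out the local key file; it says nothing about whether the key was
injected into the guest.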

-- 
multi-machine topology, cannot reach an instance from the CLC
https://bugs.launchpad.net/bugs/559230
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to eucalyptus in ubuntu.


