[Bug 587340] Re: otherwise live instance goes deaf connection refused

waltc junkwrc at comcast.net
Sat May 29 22:13:39 BST 2010


There is more history and documentation in Question #112157

In effect, I am running the cluster controller (CC) of a MANAGED-NOVLAN UEC on an Ubuntu 10.04 Desktop machine. Over time I noticed that running instances would go deaf, often while describe-instances still listed what looked like valid IP addresses. Both the public and the private IP address would be unreachable.
What's odd is that, in the case where a VPN session intervened, the IP endpoints for that cloud instance were removed after the VPN session was closed.

Even if there were a separate, dedicated CC, would one not still lose
connectivity from the client machine? My existing (and previously
existing) network is 192.168.0.xxx, served by my wireless router.
The segment reserved for the cloud instances' public IP addresses was
192.168.3.0 to 192.168.3.50, or some such limited range.

What is avahi, and why is it withdrawing the endpoint IPs of the cloud
instance?

More information. Consider this syslog excerpt:
May 28 18:26:19 cor720 NetworkManager: <info> Maximum Segment Size (MSS): 0
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 10.0.0.0/8 Next Hop: 10.0.0.0
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 192.168.251.0/24 Next Hop: 192.168.251.0
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 192.168.22.0/24 Next Hop: 192.168.22.0
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 192.168.23.0/24 Next Hop: 192.168.23.0
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 192.168.24.0/24 Next Hop: 192.168.24.0
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 63.131.134.0/24 Next Hop: 63.131.134.0
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 208.111.81.157/32 Next Hop: 208.111.81.157
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 208.111.81.159/32 Next Hop: 208.111.81.159
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 72.20.25.16/32 Next Hop: 72.20.25.16
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 209.249.222.54/32 Next Hop: 209.249.222.54
May 28 18:26:19 cor720 NetworkManager: <info> Internal IP4 DNS: 10.50.33.21
May 28 18:26:19 cor720 NetworkManager: <info> Internal IP4 DNS: 10.5.4.1
May 28 18:26:19 cor720 NetworkManager: <info> DNS Domain: 'na.global.ad'
May 28 18:26:19 cor720 NetworkManager: <info> Login Banner:
May 28 18:26:19 cor720 NetworkManager: <info> -----------------------------------------
May 28 18:26:19 cor720 NetworkManager: <info> -----------------------------------------
May 28 18:26:19 cor720 vpnc[3537]: can't open pidfile /var/run/vpnc/pid for writing
May 28 18:26:20 cor720 NetworkManager: <info> VPN connection 'Monster (Maynard)' (IP Config Get) complete.
May 28 18:26:20 cor720 NetworkManager: <info> Policy set 'Monster (Maynard)' (tun0) as default for routing and DNS.
May 28 18:26:20 cor720 vmnetBridge: RTM_NEWROUTE: index:5
May 28 18:26:20 cor720 NetworkManager: <info> VPN plugin state changed: 4
May 28 18:26:20 cor720 nm-dispatcher.action: Script '/etc/NetworkManager/dispatcher.d/01ifupdown' exited with error status 1.
May 28 18:27:36 cor720 dhcpd: DHCPREQUEST for 172.19.1.2 from d0:0d:30:cf:06:f7 via eth0
May 28 18:27:36 cor720 dhcpd: DHCPACK on 172.19.1.2 to d0:0d:30:cf:06:f7 via eth0
May 28 18:28:26 cor720 vpnc[3537]: select: Interrupted system call
May 28 18:28:26 cor720 vpnc[3537]: terminated by signal: 15
May 28 18:28:26 cor720 avahi-daemon[1175]: Withdrawing address record for 172.19.1.1 on eth0.
May 28 18:28:26 cor720 avahi-daemon[1175]: Withdrawing address record for 192.168.3.100 on eth0.
May 28 18:28:27 cor720 NetworkManager: <info> Policy set 'Auto eth0' (eth0) as default for routing and DNS.
May 28 18:28:27 cor720 vmnetBridge: RTM_NEWROUTE: index:2
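
To line up the address withdrawals with the VPN teardown, the excerpt above can be scanned mechanically. This is only a rough sketch: the regular expression assumes the syslog layout shown above, and the function name withdrawn_addresses is made up for illustration. As far as I understand, avahi-daemon logs this message when an address disappears from an interface, so these lines are probably a symptom and not the cause.

```python
import re

# Matches lines of the form seen in the excerpt, e.g.
# "May 28 18:28:26 cor720 avahi-daemon[1175]: Withdrawing address record for 172.19.1.1 on eth0."
WITHDRAW_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+avahi-daemon\[\d+\]:\s+"
    r"Withdrawing address record for (?P<addr>\S+) on (?P<iface>\w+)\.$"
)


def withdrawn_addresses(lines):
    """Return (timestamp, address, interface) for each avahi withdrawal line."""
    found = []
    for line in lines:
        match = WITHDRAW_RE.match(line.strip())
        if match:
            found.append((match.group("stamp"), match.group("addr"), match.group("iface")))
    return found


# Lines copied from the excerpt above.
sample = [
    "May 28 18:28:26 cor720 vpnc[3537]: terminated by signal: 15",
    "May 28 18:28:26 cor720 avahi-daemon[1175]: Withdrawing address record for 172.19.1.1 on eth0.",
    "May 28 18:28:26 cor720 avahi-daemon[1175]: Withdrawing address record for 192.168.3.100 on eth0.",
    "May 28 18:28:27 cor720 NetworkManager: <info> Policy set 'Auto eth0' (eth0) as default for routing and DNS.",
]

for stamp, addr, iface in withdrawn_addresses(sample):
    print(stamp, addr, iface)
```

Both withdrawals carry the same timestamp as the vpnc termination, which is what suggests the two events are related.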

There was, likely, a corresponding loss of the open connection to the instance.
When I saw it, I tried to log on again.
I also tried restarting eucalyptus on the cluster as well as eucalyptus-cc:
ubuntu@ip-172-19-1-2:~$ Write failed: Broken pipe
walt@cor720:~$ ssh -i .euca/mykey.priv ubuntu@192.168.3.100
ssh: connect to host 192.168.3.100 port 22: Connection timed out
walt@cor720:~$ ssh -i .euca/mykey.priv ubuntu@192.168.3.100
ssh: connect to host 192.168.3.100 port 22: Connection timed out
walt@cor720:~$

I also tried rebooting the instance. I then looked at the tail end of
the console log and found this:

Begin: Running /scripts/local-bottom ...
Done.
Done.
Begin: Running /scripts/init-bottom ...
Done.
cloud-init running: Sat, 29 May 2010 01:29:42 +0000. up 9.06 seconds

waiting for metadata service at http://169.254.169.254/2009-04-04/meta-data/instance-id
  01:29:44 [ 1/100]: url error [timed out]
  01:29:47 [ 2/100]: url error [timed out]
  01:29:50 [ 3/100]: url error [timed out]
  01:29:51 [ 4/100]: url error [[Errno 113] No route to host]
  01:29:54 [ 5/100]: url error [timed out]
  01:29:57 [ 6/100]: url error [timed out]
  01:30:01 [ 7/100]: url error [timed out]
  01:30:05 [ 8/100]: url error [timed out]
  01:30:09 [ 9/100]: url error [timed out]
  01:30:13 [10/100]: url error [timed out]
  01:30:17 [11/100]: url error [timed out]
  01:30:22 [12/100]: url error [timed out]
  01:30:27 [13/100]: url error [timed out]
  01:30:32 [14/100]: url error [timed out]
  01:30:37 [15/100]: url error [timed out]
  01:30:42 [16/100]: url error [timed out]
  01:30:48 [17/100]: url error [timed out]
  01:30:54 [18/100]: url error [timed out]
  01:31:00 [19/100]: url error [[Errno 113] No route to host]
  01:31:06 [20/100]: url error [timed out]
  01:31:12 [21/100]: url error [timed out]
  01:31:19 [22/100]: url error [timed out]
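
The console output above is a retry loop against the metadata URL. To reproduce the same check by hand from inside an instance, something like the following could be used. It is a minimal sketch and not cloud-init's actual code: the function names, the delay, and the timeout are assumptions, and only the URL and the 100-attempt limit come from the log.

```python
import time
import urllib.request

# URL taken from the console log above.
METADATA_URL = "http://169.254.169.254/2009-04-04/meta-data/instance-id"


def fetch_once(url, timeout=2.0):
    """Fetch the URL once and return the body as text; raises OSError on failure."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()


def wait_for_metadata(url=METADATA_URL, attempts=100, delay=1.0,
                      fetch=fetch_once, sleep=time.sleep):
    """Retry until a fetch succeeds; return the body, or None if all attempts fail."""
    for n in range(1, attempts + 1):
        try:
            return fetch(url)
        except OSError as exc:
            # URLError and socket timeouts are both OSError subclasses.
            print(f"[{n:>2}/{attempts}]: url error [{exc}]")
            if n < attempts:
                sleep(delay)
    return None


# Example (performs real network requests, so it is left commented out):
# instance_id = wait_for_metadata(attempts=5)
```

The fetch and sleep parameters exist only so the loop can be exercised without a network; a "No route to host" result here would point at routing on the CC and not at the instance itself.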

The system log on the CC shows:
May 28 21:23:47 cor720 dhcpd: DHCPREQUEST for 172.19.1.2 from d0:0d:30:cf:06:f7 via eth0
May 28 21:23:47 cor720 dhcpd: DHCPACK on 172.19.1.2 to d0:0d:30:cf:06:f7 via eth0
May 28 21:23:47 cor720 dhcpd: DHCPDISCOVER from d0:0d:30:cf:06:f7 via eth0
May 28 21:23:47 cor720 dhcpd: DHCPOFFER on 172.19.1.2 to d0:0d:30:cf:06:f7 via eth0
May 28 21:23:47 cor720 dhcpd: DHCPREQUEST for 172.19.1.2 (169.254.169.254) from d0:0d:30:cf:06:f7 via eth0
May 28 21:23:47 cor720 dhcpd: DHCPACK on 172.19.1.2 to d0:0d:30:cf:06:f7 via eth0
May 28 21:24:16 cor720 dhcpd: DHCPREQUEST for 172.19.1.2 from d0:0d:30:cf:06:f7 via eth0
May 28 21:24:16 cor720 dhcpd: DHCPACK on 172.19.1.2 to d0:0d:30:cf:06:f7 via eth0
May 28 21:27:47 cor720 init: uec-component-listener main process (3949) killed by TERM signal
May 28 21:28:47 cor720 init: uec-component-listener main process (22579) killed by TERM signal
May 28 21:29:45 cor720 dhcpd: DHCPREQUEST for 172.19.1.2 from d0:0d:30:cf:06:f7 via eth0
May 28 21:29:45 cor720 dhcpd: DHCPACK on 172.19.1.2 to d0:0d:30:cf:06:f7 via eth0

The instance, while continuing to show a running state, never shows a re-established IP address pair:
RESERVATION r-4481087B admin default
INSTANCE i-30CF06F7 emi-DEF41072 0.0.0.0 0.0.0.0 running mykey 0 m1.large 1970-01-01T00:00:00.65Z cluster1 eki-F52010F2 eri-0960114A
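
Spotting this state by eye is easy to miss, so the describe-instances output could be checked by a small script. A minimal sketch follows, assuming the column order seen in the INSTANCE line above (ID, image, public IP, private IP, state); the function name is hypothetical and other versions of the tools may order the fields differently.

```python
def unaddressed_running_instances(output):
    """Return IDs of instances reported as running with 0.0.0.0 for both addresses."""
    stuck = []
    for line in output.splitlines():
        fields = line.split()
        # Field layout assumed from the INSTANCE line quoted above:
        # INSTANCE <id> <image> <public-ip> <private-ip> <state> ...
        if len(fields) >= 6 and fields[0] == "INSTANCE":
            instance_id = fields[1]
            public_ip, private_ip, state = fields[3], fields[4], fields[5]
            if state == "running" and public_ip == "0.0.0.0" and private_ip == "0.0.0.0":
                stuck.append(instance_id)
    return stuck


# Output copied from the report above.
sample_output = """RESERVATION r-4481087B admin default
INSTANCE i-30CF06F7 emi-DEF41072 0.0.0.0 0.0.0.0 running mykey 0 m1.large 1970-01-01T00:00:00.65Z cluster1 eki-F52010F2 eri-0960114A
"""

print(unaddressed_running_instances(sample_output))
```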

-- 
otherwise live instance goes deaf connection refused
https://bugs.launchpad.net/bugs/587340
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to eucalyptus in ubuntu.


