[Bug 1715254] Re: nova-novncproxy process gets wedged, requiring kill -HUP
Seyeong Kim
seyeong.kim at canonical.com
Mon Oct 30 07:08:04 UTC 2017
Hello Corey,
I've tested Trusty & UCA Icehouse, but I couldn't reproduce this issue there.
The log messages are different from Kilo, Mitaka and Xenial: there is no
'Reaing zombies, active child count is' entry, while those entries show up
a lot on Kilo, Mitaka and Xenial.
I saw James's latest commit, which is a patch for multiprocessing, but it
does not seem to work on Trusty / UCA Icehouse (not 100% sure).
--
You received this bug notification because you are a member of Ubuntu
Sponsors Team, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1715254
Title:
nova-novncproxy process gets wedged, requiring kill -HUP
Status in OpenStack nova-cloud-controller charm:
Invalid
Status in Ubuntu Cloud Archive:
Invalid
Status in Ubuntu Cloud Archive icehouse series:
Triaged
Status in Ubuntu Cloud Archive kilo series:
Triaged
Status in Ubuntu Cloud Archive mitaka series:
Triaged
Status in websockify package in Ubuntu:
Invalid
Status in websockify source package in Trusty:
Triaged
Status in websockify source package in Xenial:
Triaged
Bug description:
[Impact]
affected
- UCA Mitaka, Kilo
- Xenial
not affected
- UCA Icehouse
- Trusty
(the log symptoms are different: there is no 'Reaing zombies' message ('Reaing' is a typo in websockify itself), etc.)
When there are many connections, or clients reconnect to the console
frequently, the nova-novncproxy daemon gets stuck because websockify hangs.
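For background, websockify handles each proxied console connection in a
child process and periodically reaps exited children (that is where the
'Reaing zombies, active child count is' log line comes from). The following
is only a minimal, illustrative sketch of that fork-per-connection pattern,
not websockify's actual code: if the parent stops reaping, exited children
accumulate as zombies and the accept loop can end up wedged, which matches
the symptom above.
import os
import socket
# Illustrative fork-per-connection server (not websockify itself).
# Each accepted connection is handled in a child process; the parent
# must reap exited children, otherwise they linger as zombies.
def reap_children():
    # Non-blocking reap: collect every child that has already exited.
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except OSError:
            return          # no children at all
        if pid == 0:
            return          # children exist, but none have exited yet
def serve(host="127.0.0.1", port=16080):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(100)
    while True:
        conn, _addr = srv.accept()
        if os.fork() == 0:      # child: handle one client, then exit
            srv.close()
            conn.sendall(b"hello\n")
            conn.close()
            os._exit(0)
        conn.close()            # parent: the child owns the connection now
        reap_children()         # if this step stops happening, zombies pile up
if __name__ == "__main__":
    serve()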
[Test case]
1. Deploy OpenStack
2. Create instances
3. Open the console in a browser with an auto-refresh extension set to 5 seconds (a rough load-generation sketch follows this list)
4. After several hours, connections are rejected
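The auto-refresh step can be approximated with a small, hypothetical load
generator that keeps opening and dropping connections to the proxy port;
the host, port and interval below are assumptions and should be adjusted to
the deployment (this only produces connection churn, not a full WebSocket
handshake).
import socket
import time
PROXY_HOST = "127.0.0.1"   # assumption: address of nova-novncproxy
PROXY_PORT = 6080          # default novncproxy_port
INTERVAL = 5               # mimic the 5-second auto-refresh
while True:
    try:
        s = socket.create_connection((PROXY_HOST, PROXY_PORT), timeout=10)
        # A real browser would complete the WebSocket handshake here;
        # for churn purposes, connecting and disconnecting is enough.
        s.close()
    except socket.error as exc:
        # Once the proxy is wedged, connections start getting rejected.
        print("connection failed: %s" % exc)
    time.sleep(INTERVAL)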
[Regression Potential]
Components that use websockify, especially nova-novncproxy, will be
affected by this patch. However, after upgrading and running the refresh
test described above for 2 days without restarting any services, no hang
occurred. I only ran this test in a simple local environment, so the
possibility of different behaviour in other circumstances still needs to
be considered.
[Others]
related commits
- https://github.com/novnc/websockify/pull/226
- https://github.com/novnc/websockify/pull/219
[Original Description]
Users reported they were unable to connect to instance consoles via
either Horizon or direct URL. Upon investigation we found errors
suggesting the address and port were in use:
2017-08-23 14:51:56.248 1355081 INFO nova.console.websocketproxy [-] WebSocket server settings:
2017-08-23 14:51:56.248 1355081 INFO nova.console.websocketproxy [-] - Listen on 0.0.0.0:6080
2017-08-23 14:51:56.248 1355081 INFO nova.console.websocketproxy [-] - Flash security policy server
2017-08-23 14:51:56.248 1355081 INFO nova.console.websocketproxy [-] - Web server (no directory listings). Web root: /usr/share/novnc
2017-08-23 14:51:56.248 1355081 INFO nova.console.websocketproxy [-] - No SSL/TLS support (no cert file)
2017-08-23 14:51:56.249 1355081 CRITICAL nova [-] error: [Errno 98] Address already in use
2017-08-23 14:51:56.249 1355081 ERROR nova Traceback (most recent call last):
2017-08-23 14:51:56.249 1355081 ERROR nova File "/usr/bin/nova-novncproxy", line 10, in <module>
2017-08-23 14:51:56.249 1355081 ERROR nova sys.exit(main())
2017-08-23 14:51:56.249 1355081 ERROR nova File "/usr/lib/python2.7/dist-packages/nova/cmd/novncproxy.py", line 41, in main
2017-08-23 14:51:56.249 1355081 ERROR nova port=CONF.vnc.novncproxy_port)
2017-08-23 14:51:56.249 1355081 ERROR nova File "/usr/lib/python2.7/dist-packages/nova/cmd/baseproxy.py", line 73, in proxy
2017-08-23 14:51:56.249 1355081 ERROR nova RequestHandlerClass=websocketproxy.NovaProxyRequestHandler
2017-08-23 14:51:56.249 1355081 ERROR nova File "/usr/lib/python2.7/dist-packages/websockify/websocket.py", line 909, in start_server
2017-08-23 14:51:56.249 1355081 ERROR nova tcp_keepintvl=self.tcp_keepintvl)
2017-08-23 14:51:56.249 1355081 ERROR nova File "/usr/lib/python2.7/dist-packages/websockify/websocket.py", line 698, in socket
2017-08-23 14:51:56.249 1355081 ERROR nova sock.bind(addrs[0][4])
2017-08-23 14:51:56.249 1355081 ERROR nova File "/usr/lib/python2.7/socket.py", line 224, in meth
2017-08-23 14:51:56.249 1355081 ERROR nova return getattr(self._sock,name)(*args)
2017-08-23 14:51:56.249 1355081 ERROR nova error: [Errno 98] Address already in use
2017-08-23 14:51:56.249 1355081 ERROR nova
This led us to the discovery of a stuck nova-novncproxy process that
remained after stopping the service. Once we sent a kill -HUP to that
process, we were able to start nova-novncproxy again and restore service
to the users.
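For reference, the manual workaround above can be scripted. This is only a
hypothetical helper, assuming pgrep is available and that, with the service
already stopped, any process still matching 'nova-novncproxy' is the stuck
one.
import os
import signal
import subprocess
# Find any leftover nova-novncproxy process (after the service has been
# stopped) and send it SIGHUP so the port is released and the service can
# be started again.
try:
    output = subprocess.check_output(["pgrep", "-f", "nova-novncproxy"])
except subprocess.CalledProcessError:
    output = b""   # no leftover process found
for pid in output.split():
    os.kill(int(pid), signal.SIGHUP)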
This was not the first time we have had to restart the nova-novncproxy
service after users reported they were unable to connect with VNC. This
time, as well as on at least 2 other occasions, we have seen the following
errors in nova-novncproxy.log during the time frame of the issue:
gaierror: [Errno -8] Servname not supported for ai_socktype
which seems to correspond to log entries for connection strings with an
invalid port ('port': u'-1'), as well as a bunch of:
error: [Errno 104] Connection reset by peer
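The gaierror itself is easy to illustrate in isolation: on a glibc-based
system, passing a port string like '-1' as the service argument to
getaddrinfo, which is roughly what happens when such a connection string is
used, fails with EAI_SERVICE. The snippet below is only an illustration of
the error, not the actual nova/websockify code path.
import socket
try:
    # '-1' is neither a valid numeric port nor a known service name,
    # so getaddrinfo rejects it with EAI_SERVICE (-8).
    socket.getaddrinfo("127.0.0.1", "-1", socket.AF_INET, socket.SOCK_STREAM)
except socket.gaierror as exc:
    print(exc)  # [Errno -8] Servname not supported for ai_socktype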
To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-nova-cloud-controller/+bug/1715254/+subscriptions