[Bug 1599636] Re: pause failing on HA deployment: haproxy is running
James Page
james.page at ubuntu.com
Tue Jul 12 08:30:44 UTC 2016
Pausing the hacluster unit first does disable the haproxy process
running on the unit; pausing the principal charm alone will not pause
haproxy, as it is managed directly by pacemaker, so it will be
restarted as soon as it is shut down via the normal init script.
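A minimal sketch of that workaround ordering, assuming a keystone unit
whose hacluster subordinate is named keystone-hacluster/0 (unit names
here are illustrative), using the same Juju 1.x action syntax as in
the report below:

$ juju action do keystone-hacluster/0 pause
$ juju action fetch <action-id>     # wait for status: completed
$ juju action do keystone/0 pause   # haproxy now stays stopped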
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to cinder in Juju Charms Collection.
Matching subscriptions: charm-bugs
https://bugs.launchpad.net/bugs/1599636
Title:
pause failing on HA deployment: haproxy is running
Status in ceilometer package in Juju Charms Collection:
Triaged
Status in ceph-radosgw package in Juju Charms Collection:
Triaged
Status in cinder package in Juju Charms Collection:
Triaged
Status in glance package in Juju Charms Collection:
Triaged
Status in keystone package in Juju Charms Collection:
Triaged
Status in neutron-api package in Juju Charms Collection:
Triaged
Status in nova-cloud-controller package in Juju Charms Collection:
Triaged
Status in openstack-dashboard package in Juju Charms Collection:
Triaged
Bug description:
I have an HA Mitaka cloud (deployed with Autopilot), and am checking
the pause/resume actions of its units.
When trying to "action pause" the units of most services that use
hacluster (keystone, cinder, neutron-api, ceilometer, nova-cloud-
controller, ceph-radosgw, and less often openstack-dashboard), the
action fails *most of the time* because haproxy is running. "juju
action pause" tries to stop the haproxy service, but once
pacemaker/corosync detects that haproxy isn't running, it restarts
the service. haproxy is never really stopped, so the action fails.
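The race is easy to reproduce by hand on an affected unit; a rough
illustration (assuming the haproxy resource is pacemaker-managed, as
in this deployment; resource names vary per cluster):

$ sudo service haproxy stop           # stop via the normal init script
$ pgrep haproxy                       # back within pacemaker's monitor interval
$ sudo crm_mon -1 | grep -i haproxy   # resource still reported as Started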
Keystone example:
$ juju action fetch 783e4ee0-a498-42e3-8448-45ac26f6a847
message: 'Couldn''t pause: Services should be paused but these services running: haproxy, these ports which should be closed, but are open: 5000, 35357, Paused. Use ''resume'' action to resume normal service.'
status: failed
juju logs excerpt for the paused unit:
https://pastebin.canonical.com/160484/
juju status of the whole environment:
https://pastebin.canonical.com/160483/
/var/log/syslog excerpt right after juju action pause is issued:
https://pastebin.canonical.com/160379/
The actions for these services sometimes work, but the vast majority
of attempts fail. This could indicate that something incidental is
being relied upon (e.g. assuming the network is "fast" enough that
races aren't an issue).
Output of a script that pauses and resumes one unit per service to
check the behavior: https://pastebin.canonical.com/160490/. Notice
that neutron-api, despite the action failing, reports the unit as
successfully paused shortly afterwards.
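For reference, a sketch of the kind of loop such a check script runs
(unit names illustrative; the real script's output is in the pastebin
above; actions on a given unit queue and run serially):

$ for unit in keystone/0 cinder/0 neutron-api/0 ceilometer/0; do
>     juju action do $unit pause
>     juju action do $unit resume
> done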
To manage notifications about this bug go to:
https://bugs.launchpad.net/charms/+source/ceilometer/+bug/1599636/+subscriptions