[Bug 1878548] Re: There are cases when masakari-hostmonitor will recognize online nodes as offline and send (in)appropriate notifications to Masakari
Brian Murray
1878548 at bugs.launchpad.net
Tue Dec 6 22:12:36 UTC 2022
Hello Daisuke, or anyone else affected,
Accepted masakari-monitors into focal-proposed. The package will build
now and be available at
https://launchpad.net/ubuntu/+source/masakari-monitors/9.0.0-0ubuntu0.20.04.2
in a few hours, and then in the -proposed repository.
Please help us by testing this new package. See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed. Your feedback will aid us in getting this
update out to other Ubuntu users.
If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, what testing has been
performed on the package and change the tag from
verification-needed-focal to verification-done-focal. If it does not fix
the bug for you, please add a comment stating that, and change the tag to
verification-failed-focal. In either case, without details of your
testing we will not be able to proceed.
Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in
advance for helping!
N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.
** Changed in: masakari-monitors (Ubuntu Focal)
Status: Triaged => Fix Committed
** Tags added: verification-needed verification-needed-focal
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1878548
Title:
There are cases when masakari-hostmonitor will recognize online nodes
as offline and send (in)appropriate notifications to Masakari
Status in Ubuntu Cloud Archive:
Fix Released
Status in Ubuntu Cloud Archive ussuri series:
Triaged
Status in Ubuntu Cloud Archive victoria series:
Fix Committed
Status in Ubuntu Cloud Archive wallaby series:
Fix Committed
Status in masakari-monitors:
Fix Released
Status in masakari-monitors ussuri series:
Fix Released
Status in masakari-monitors victoria series:
Fix Released
Status in masakari-monitors wallaby series:
Fix Released
Status in masakari-monitors xena series:
Fix Released
Status in masakari-monitors package in Ubuntu:
Fix Released
Status in masakari-monitors source package in Focal:
Fix Committed
Bug description:
[Issue]
ComputeNodes are managed by pacemaker_remote in my environment.
When one ComputeNode is isolated in the network, the masakari-hostmonitors on the other ComputeNodes send a failure notification about the isolated ComputeNode to masakari-api.
At the same time, the masakari-hostmonitor on the isolated ComputeNode recognizes the other ComputeNodes as offline, so it sends failure notifications about ComputeNodes that are actually online.
As a result, masakari-engine runs the recovery procedure on online ComputeNodes.
[Cause]
The current masakari-hostmonitor cannot determine whether it is isolated in the network when ComputeNodes are managed by pacemaker_remote.
With pacemaker (not pacemaker_remote), masakari-hostmonitor waits until it is killed if it is isolated in the network. This is implemented in the following code:
<https://github.com/openstack/masakari-monitors/blob/master/masakarimonitors/hostmonitor/host_handler/handle_host.py#L398-L402>
With pacemaker_remote, however, masakari-hostmonitor does not determine whether it is isolated:
<https://github.com/openstack/masakari-monitors/blob/master/masakarimonitors/hostmonitor/host_handler/handle_host.py#L93-L95>
[Solution]
A ComputeNode managed by pacemaker_remote should recognize itself as offline when it is isolated.
The state monitoring process should be skipped in that case.
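
The proposed behaviour can be illustrated with a minimal sketch. This is not the masakari-monitors code or the actual patch; the function names and the plain dict of node states are hypothetical, and it assumes the monitor can obtain a status map in which an isolated local node appears as something other than "online".

```python
ONLINE = "online"


def is_isolated(local_host, node_states):
    """Return True if the local node is not reported as online.

    node_states is a hypothetical {hostname: state} mapping. A node that
    is missing from the mapping is treated as not online (assumption).
    """
    return node_states.get(local_host) != ONLINE


def hosts_to_report(local_host, node_states):
    """Return the remote hosts for which a failure would be reported."""
    # Skip state monitoring entirely when the local node sees itself as
    # offline, so an isolated node never reports healthy peers as failed.
    if is_isolated(local_host, node_states):
        return []
    return sorted(
        host
        for host, state in node_states.items()
        if host != local_host and state != ONLINE
    )
```

With this check in place, a node that sees itself as offline returns an empty list instead of reporting every other ComputeNode as failed, while a node that sees itself as online still reports peers that are not online.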
See comment #11 for how yoctozepto managed to reproduce something
similar to what is described.
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1878548/+subscriptions