[Bug 2065848] Re: An ocf:heartbeat:nfsserver resource's stop operation succeeded despite the /var/lib/nfs filesystem failing to unmount.
Athos Ribeiro
2065848 at bugs.launchpad.net
Fri Sep 20 14:11:23 UTC 2024
** Description changed:
[ Impact ]
The nfsserver resource agent unmounts /var/lib/nfs as part of its stop
process. When that unmount fails and the rest of the stop operation
succeeds, the whole stop operation is reported as successful. It should
fail in such cases; otherwise, this may hinder recovery of other
resources.
[ Test Plan ]
Install the relevant packages:
# apt update && apt install -y pacemaker corosync resource-agents nfs-kernel-server pcs
Create an ocf:heartbeat:nfsserver resource with an nfs_shared_infodir:
# pcs resource create nfs-daemon ocf:heartbeat:nfsserver nfs_shared_infodir=/mnt/nfs_shared_infodir
Hold open /var/lib/nfs:
# touch /var/lib/nfs/testfile
# exec 3>/var/lib/nfs/testfile
Stop the resource:
# pcs resource debug-stop nfs-daemon
- Affected systems should report success when stopping the resource
- (Operation stop for nfs-daemon (ocf:heartbeat:nfsserver) returned: 'ok' (0)).
+ Affected systems should report success when stopping the resource:
- Also check that /var/lib/nfs is still mounted:
+ Operation stop for nfs-daemon (ocf:heartbeat:nfsserver) returned: 'ok'
+ (0)
- # mount | grep /var/lib/nfs
+ Fixed systems should fail to stop the resource:
- Fixed systems should fail to stop the resource.
+ Operation force-stop for nfs-daemon (ocf:heartbeat:nfsserver) returned 1
+ (error: Failed to unmount a bind mount)
[ Where problems could occur ]
This change makes the nfsserver resource agent return an error when
specific mounts cannot be unmounted while stopping the resource. If any
custom user workflows (erroneously) rely on the old behavior (the stop
action succeeding even when unmounting fails), this change will break
them. In that case, we should just point those users to this bug.
[ Other Info ]
This bug was originally reported (and fixed upstream) through
https://bugzilla.redhat.com/show_bug.cgi?id=1924363. The upstream fix is
available at
https://github.com/ClusterLabs/resource-agents/commit/dc4fc6fb51481e62c763212129e7dbae4cb663fd.
[ Original message ]
The resource "ocf:heartbeat:nfsserver" is considered stopped even though the process returned an error:
pacemaker-execd[7831]: notice: nfs_daemon_stop_0[65992] error output [ umount: /var/lib/nfs: target is busy. ]
Because it is considered successfully stopped, a later unmount of an LVM resource failed:
LVM-activate(LVM_nfs_infodir_LV)[69671]: ERROR: PARTIAL MODE. Incomplete logical volumes will be processed. Logical volume DCSS_VG/nfs_infodir contains a filesystem in use.
cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.4 LTS"
resource-agents-base/jammy-updates,now 1:4.7.0-1ubuntu7.2 all [installed,automatic]
Cluster Resource Agents curated by Ubuntu
resource-agents-common/jammy-updates,now 1:4.7.0-1ubuntu7.2 amd64 [installed,automatic]
Common files used by the Cluster Resource Agents
resource-agents-extra/jammy-updates,now 1:4.7.0-1ubuntu7.2 amd64 [installed]
Cluster Resource Agents
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to nfs-utils in Ubuntu.
https://bugs.launchpad.net/bugs/2065848
Title:
An ocf:heartbeat:nfsserver resource's stop operation succeeded despite
the /var/lib/nfs filesystem failing to unmount.
Status in nfs-utils package in Ubuntu:
New
Status in resource-agents package in Ubuntu:
Fix Released
Status in resource-agents source package in Jammy:
Triaged
Bug description:
[ Impact ]
The nfsserver resource agent unmounts /var/lib/nfs as part of its stop
process. When that unmount fails and the rest of the stop operation
succeeds, the whole stop operation is reported as successful. It should
fail in such cases; otherwise, this may hinder recovery of other
resources.
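The expected behavior described above can be illustrated with a minimal shell sketch. This is not the upstream nfsserver agent code: unmount_or_fail, stop_sketch and the UMOUNT_CMD override are hypothetical names introduced here for illustration only. The return codes follow the OCF convention (0 for success, 1 for a generic error).

```shell
#!/bin/sh
# Sketch only: NOT the upstream nfsserver agent code.
# OCF return codes (0 = success, 1 = generic error).
OCF_SUCCESS=0
OCF_ERR_GENERIC=1

# UMOUNT_CMD is a hypothetical override so the logic can be exercised
# without touching real mounts; it defaults to the real umount.
UMOUNT_CMD="${UMOUNT_CMD:-umount}"

# Hypothetical helper: unmount one path and report failure to the caller
# instead of silently ignoring it.
unmount_or_fail() {
    if ! $UMOUNT_CMD "$1"; then
        echo "ERROR: Failed to unmount $1" >&2
        return $OCF_ERR_GENERIC
    fi
    return $OCF_SUCCESS
}

# Hypothetical stop action: any unmount failure makes the whole stop fail,
# so the cluster does not treat the resource as cleanly stopped.
stop_sketch() {
    unmount_or_fail /var/lib/nfs || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}
```

The point of the sketch is only that the unmount result is propagated to the caller; with the affected behavior, the equivalent of stop_sketch returned 0 even though umount had failed.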
[ Test Plan ]
Install the relevant packages:
# apt update && apt install -y pacemaker corosync resource-agents nfs-kernel-server pcs
Create an ocf:heartbeat:nfsserver resource with an nfs_shared_infodir:
# pcs resource create nfs-daemon ocf:heartbeat:nfsserver nfs_shared_infodir=/mnt/nfs_shared_infodir
Hold open /var/lib/nfs:
# touch /var/lib/nfs/testfile
# exec 3>/var/lib/nfs/testfile
Stop the resource:
# pcs resource debug-stop nfs-daemon
Affected systems should report success when stopping the resource:
Operation stop for nfs-daemon (ocf:heartbeat:nfsserver) returned: 'ok' (0)
Fixed systems should fail to stop the resource:
Operation force-stop for nfs-daemon (ocf:heartbeat:nfsserver) returned 1 (error: Failed to unmount a bind mount)
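As an optional extra check, the stop result can be compared with the actual mount state. The following is a minimal sketch, not part of the official test plan: is_mounted and report_stop_result are hypothetical helper names, and the optional mounts-file argument exists only so the logic can be tried against a sample file instead of /proc/mounts.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# is_mounted PATH [MOUNTS_FILE]
# Succeeds if PATH appears as a mount point (second field) in the
# mounts file, which defaults to /proc/mounts.
is_mounted() {
    awk -v target="$1" '$2 == target { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# report_stop_result STOP_RC PATH [MOUNTS_FILE]
# Prints a one-line verdict combining the stop return code with the
# current mount state of PATH.
report_stop_result() {
    rc=$1
    path=$2
    mounts=${3:-/proc/mounts}
    if is_mounted "$path" "$mounts"; then
        if [ "$rc" -eq 0 ]; then
            echo "AFFECTED: stop returned 0 but $path is still mounted"
        else
            echo "FIXED: stop returned $rc and $path is still mounted"
        fi
    else
        echo "CLEAN: $path is not mounted (stop returned $rc)"
    fi
}
```

On a test node this might be used by running the debug-stop command and then calling report_stop_result with its exit status and /var/lib/nfs, assuming the exit status of pcs reflects the agent's result.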
[ Where problems could occur ]
This change makes the nfsserver resource agent return an error when
specific mounts cannot be unmounted while stopping the resource. If any
custom user workflows (erroneously) rely on the old behavior (the stop
action succeeding even when unmounting fails), this change will break
them. In that case, we should just point those users to this bug.
[ Other Info ]
This bug was originally reported (and fixed upstream) through
https://bugzilla.redhat.com/show_bug.cgi?id=1924363. The upstream fix
is available at
https://github.com/ClusterLabs/resource-agents/commit/dc4fc6fb51481e62c763212129e7dbae4cb663fd.
[ Original message ]
The resource "ocf:heartbeat:nfsserver" is considered stopped even though the process returned an error:
pacemaker-execd[7831]: notice: nfs_daemon_stop_0[65992] error output [ umount: /var/lib/nfs: target is busy. ]
Because it is considered successfully stopped, a later unmount of an LVM resource failed:
LVM-activate(LVM_nfs_infodir_LV)[69671]: ERROR: PARTIAL MODE. Incomplete logical volumes will be processed. Logical volume DCSS_VG/nfs_infodir contains a filesystem in use.
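When umount reports "target is busy", it can help to see which processes still hold files open under the directory. The sketch below is an illustration only and assumes a Linux /proc filesystem; list_holders is a hypothetical helper name. It only detects open file descriptors, not other kinds of use such as a working directory inside the mount.

```shell
#!/bin/sh
# Sketch only: list processes holding files open below a directory by
# scanning /proc (Linux-specific). list_holders is a hypothetical name.
list_holders() {
    dir=$1
    for fd in /proc/[0-9]*/fd/*; do
        # Skip entries that vanish or that we are not allowed to read.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            "$dir"/*)
                pid=${fd#/proc/}   # strip the leading /proc/
                pid=${pid%%/*}     # keep only the PID component
                echo "$pid $target"
                ;;
        esac
    done
}
```

Calling list_holders /var/lib/nfs prints one "PID path" pair per open file found. Where psmisc is installed, fuser -vm /var/lib/nfs gives a more complete picture.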
cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.4 LTS"
resource-agents-base/jammy-updates,now 1:4.7.0-1ubuntu7.2 all [installed,automatic]
Cluster Resource Agents curated by Ubuntu
resource-agents-common/jammy-updates,now 1:4.7.0-1ubuntu7.2 amd64 [installed,automatic]
Common files used by the Cluster Resource Agents
resource-agents-extra/jammy-updates,now 1:4.7.0-1ubuntu7.2 amd64 [installed]
Cluster Resource Agents
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2065848/+subscriptions