[Bug 2003530] Re: Rook mgr module crashes due to missing mgr.nfs
Luciano Lo Giudice
2003530 at bugs.launchpad.net
Mon Jul 10 21:21:22 UTC 2023
Here's the test plan for Kinetic:
First, deploy a small ceph cluster. If using Juju, we can use something
like:
'juju add-machine --series="kinetic"'
Enable the -proposed archive. Every target machine must have the
following file with these contents:
$ cat /etc/apt/sources.list.d/ubuntu-kinetic-proposed.list
deb http://archive.ubuntu.com/ubuntu kinetic-proposed main multiverse restricted universe
Verify that the version to test is the one that is going to be
installed:
$ apt-cache policy ceph
ceph:
  Installed: (none)
  Candidate: 17.2.6-0ubuntu0.22.10.1
  Version table:
     17.2.6-0ubuntu0.22.10.1 500
        500 http://archive.ubuntu.com/ubuntu kinetic-proposed/main amd64 Packages
     17.2.5-0ubuntu0.22.10.3 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu kinetic-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu kinetic-security/main amd64 Packages
     17.2.0-0ubuntu4 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu kinetic/main amd64 Packages
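The candidate check can also be scripted. The following is a minimal
sketch, assuming the usual "Installed:" / "Candidate:" layout of
`apt-cache policy` output; the helper name parse_policy is made up for
illustration and is not part of any package.

```python
import re


def parse_policy(text):
    """Return (installed, candidate) from `apt-cache policy <pkg>` output.

    Hypothetical helper: it only looks at the "Installed:" and
    "Candidate:" lines and ignores the version table.
    """
    fields = {}
    for line in text.splitlines():
        match = re.match(r"\s*(Installed|Candidate):\s*(\S+)", line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields.get("Installed"), fields.get("Candidate")


# Sample taken from the output shown in this test plan.
sample = """\
ceph:
  Installed: (none)
  Candidate: 17.2.6-0ubuntu0.22.10.1
"""
installed, candidate = parse_policy(sample)
print(installed, candidate)
```

In practice the text would come from running `apt-cache policy ceph` on
each target machine, and the candidate would be compared against the
version under test.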
Once the ceph cluster has been deployed successfully, we can ssh into
one of the mons and test the rook module.
First, we verify that the rook module is not yet running:
$ sudo ceph mgr module ls
MODULE
balancer on (always on)
crash on (always on)
devicehealth on (always on)
orchestrator on (always on)
pg_autoscaler on (always on)
progress on (always on)
rbd_support on (always on)
status on (always on)
telemetry on (always on)
volumes on (always on)
iostat on
nfs on
restful on
alerts -
influx -
insights -
localpool -
mirroring -
osd_perf_query -
osd_support -
prometheus -
selftest -
snap_schedule -
stats -
telegraf -
test_orchestrator -
zabbix -
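For repeated runs, the module listing can be checked programmatically
instead of by eye. This is only a sketch, assuming the tabular layout
shown above ("on (always on)", "on", or "-"); parse_module_ls and
rook_enabled are hypothetical helper names.

```python
def parse_module_ls(text):
    """Map module name -> state from tabular `ceph mgr module ls` output.

    States are "always_on", "on", or "off" (printed as "-").
    Assumes the column layout shown in this test plan.
    """
    states = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2 or parts[0] == "MODULE":
            continue  # skip blank lines and the header row
        status = " ".join(parts[1:])
        if status == "on (always on)":
            states[parts[0]] = "always_on"
        elif status == "on":
            states[parts[0]] = "on"
        elif status == "-":
            states[parts[0]] = "off"
    return states


def rook_enabled(text):
    """True only if rook is listed and switched on."""
    return parse_module_ls(text).get("rook") == "on"


before = "MODULE\nbalancer on (always on)\nnfs on\nzabbix -\n"
print(rook_enabled(before))
```

A module that is not installed does not appear in the listing at all,
so the lookup treats a missing entry the same as a disabled one.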
Then, we install and enable the module:
$ sudo apt install ceph-mgr-rook
$ sudo ceph mgr module enable rook
Verify that the cluster is healthy:
$ sudo ceph -s
  cluster:
    id:     c3ab9238-1f66-11ee-9277-31985965425a
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum juju-233a7d-ceph-kinetic-0,juju-233a7d-ceph-kinetic-2,juju-233a7d-ceph-kinetic-1 (age 4m)
    mgr: juju-233a7d-ceph-kinetic-2(active, since 13s), standbys: juju-233a7d-ceph-kinetic-1, juju-233a7d-ceph-kinetic-0
    osd: 3 osds: 3 up, 3 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:
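The health check could be automated along these lines. This sketch
assumes the JSON form of the status output (`ceph -s --format json`)
carries the health string under "health" -> "status"; that layout is an
assumption here and should be confirmed against the release under test.
The function name cluster_is_healthy is made up for illustration.

```python
import json


def cluster_is_healthy(status_json):
    """Return True if the status document reports HEALTH_OK.

    Assumes a {"health": {"status": ...}} layout in the JSON output;
    anything else (including a missing key) counts as unhealthy.
    """
    status = json.loads(status_json)
    return status.get("health", {}).get("status") == "HEALTH_OK"


# Hand-written sample, not captured from a real cluster.
sample = json.dumps({"health": {"status": "HEALTH_OK"}})
print(cluster_is_healthy(sample))
```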
Lastly, check that the rook module is up and running:
$ sudo ceph mgr module ls
MODULE
balancer on (always on)
crash on (always on)
devicehealth on (always on)
orchestrator on (always on)
pg_autoscaler on (always on)
progress on (always on)
rbd_support on (always on)
status on (always on)
telemetry on (always on)
volumes on (always on)
iostat on
nfs on
restful on
rook on
alerts -
influx -
insights -
localpool -
mirroring -
osd_perf_query -
osd_support -
prometheus -
selftest -
snap_schedule -
stats -
telegraf -
test_orchestrator -
zabbix -
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/2003530
Title:
Rook mgr module crashes due to missing mgr.nfs
Status in ceph package in Ubuntu:
Fix Released
Status in ceph source package in Jammy:
Fix Committed
Status in ceph source package in Kinetic:
Fix Committed
Status in ceph source package in Lunar:
Fix Released
Bug description:
[Impact]
The rook mgr module crashes when the ceph-mgr-rook package is
installed (see the traceback below, taken from /var/log/syslog). The
cause is a missing ceph mgr package, "nfs", which the rook mgr module
depends on.
This makes the rook mgr module unusable, and that module is required
for integrating Ceph with the Rook storage orchestrator.
The proposed patch fixes this by including the nfs mgr package in
the ceph-mgr-modules-core .deb. This is similar to how upstream
packages nfs for the ceph mgr.
Jan 17 16:39:18 devcontainer-269785 bash[247610]: debug 2023-01-17T16:39:18.008+0000 7f930419fdc0 -1 mgr[py] Module not found: 'rook'
Jan 17 16:39:18 devcontainer-269785 bash[247610]: debug 2023-01-17T16:39:18.008+0000 7f930419fdc0 -1 mgr[py] Traceback (most recent call last):
Jan 17 16:39:18 devcontainer-269785 bash[247610]: File "/usr/share/ceph/mgr/rook/__init__.py", line 5, in <module>
Jan 17 16:39:18 devcontainer-269785 bash[247610]: from .module import RookOrchestrator
Jan 17 16:39:18 devcontainer-269785 bash[247610]: File "/usr/share/ceph/mgr/rook/module.py", line 41, in <module>
Jan 17 16:39:18 devcontainer-269785 bash[247610]: from .rook_cluster import RookCluster
Jan 17 16:39:18 devcontainer-269785 bash[247610]: File "/usr/share/ceph/mgr/rook/rook_cluster.py", line 29, in <module>
Jan 17 16:39:18 devcontainer-269785 bash[247610]: from nfs.cluster import create_ganesha_pool
Jan 17 16:39:18 devcontainer-269785 bash[247610]: ModuleNotFoundError: No module named 'nfs'
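The failure in the traceback above is an import of a top-level module
that is not present in the mgr module directory. As a rough way to
check for this without loading the mgr, the sketch below asks Python's
import machinery whether given module names can be located in one
directory. The helper name missing_modules is hypothetical; the
directory /usr/share/ceph/mgr is the one shown in the traceback.

```python
from importlib.machinery import PathFinder


def missing_modules(names, search_dir):
    """Return the names that cannot be located as top-level modules.

    Only search_dir is consulted, and nothing is actually imported.
    """
    return [
        name for name in names
        if PathFinder.find_spec(name, [search_dir]) is None
    ]


if __name__ == "__main__":
    # On an affected system, "nfs" would be expected in the result.
    print(missing_modules(["rook", "nfs"], "/usr/share/ceph/mgr"))
```

An empty list means every named module was found in that directory; it
does not prove that the modules import cleanly, only that they exist.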
[Test plan]
The test requires a Ceph cluster. SSH to a system with a running
ceph-mon service.
$ sudo ceph mgr module ls # verify: no rook mgr module
$ sudo apt-get -q install ceph-mgr-rook
$ sudo ceph -s # verify: no crashed modules
$ sudo ceph mgr module ls # verify: rook mgr module present and enabled
[Where problems could occur]
The proposed patch only includes an additional Python package, and
regression potential should be low.
Issues could occur due to packaging bugs, such as missing dependencies
for the nfs mgr package. As the nfs package is currently missing,
there should not be any additional impact due to this.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/2003530/+subscriptions