[Bug 2003530] Re: Rook mgr module crashes due to missing mgr.nfs
Luciano Lo Giudice
2003530 at bugs.launchpad.net
Mon Jul 10 20:24:06 UTC 2023
Hello everyone, and sorry about the noise before. Here's a test plan
that I think should satisfy all the requirements.
First, we prepare the machines for a small ceph cluster. I used juju to
add some machines with the following command:
'juju add-machine --series=jammy'
Afterwards, we ssh into the target machines and add the -proposed
archives:
$ cat /etc/apt/sources.list.d/ubuntu-jammy-proposed.list
deb http://archive.ubuntu.com/ubuntu jammy-proposed main multiverse restricted universe
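For anyone repeating this on another series, the archive line can be generated with a small helper. This is only a sketch: the function name proposed_source_line is made up for illustration, and the mirror URL and component order are simply copied from the file shown above.

```shell
# Print the apt source line for the -proposed pocket of an Ubuntu series.
# Hypothetical helper; mirror and components are copied from the listing above.
proposed_source_line() {
    series="$1"
    printf 'deb http://archive.ubuntu.com/ubuntu %s-proposed main multiverse restricted universe\n' "$series"
}
```

The output can be written to the list file with something like 'proposed_source_line jammy | sudo tee /etc/apt/sources.list.d/ubuntu-jammy-proposed.list', followed by 'sudo apt update'.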
We can verify that version 17.2.6 is going to be installed by running:
$ apt-cache policy ceph
ceph:
  Installed: 17.2.6-0ubuntu0.22.04.1
  Candidate: 17.2.6-0ubuntu0.22.04.1
  Version table:
 *** 17.2.6-0ubuntu0.22.04.1 500
        500 http://archive.ubuntu.com/ubuntu jammy-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     17.2.5-0ubuntu0.22.04.3 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu jammy-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu jammy-security/main amd64 Packages
     17.1.0-0ubuntu3 500
        500 http://nova.clouds.archive.ubuntu.com/ubuntu jammy/main amd64 Packages
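If the check needs to be scripted rather than read by eye, the candidate version can be pulled out of that output. The following is a rough sketch, assuming the 'Candidate:' line keeps the layout shown above; candidate_version is a hypothetical name, not part of apt.

```shell
# Read `apt-cache policy <package>` output on stdin and print the
# candidate version (the second field of the "Candidate:" line).
candidate_version() {
    awk '$1 == "Candidate:" { print $2; exit }'
}
```

It would be used as 'apt-cache policy ceph | candidate_version', and the result compared against the expected 17.2.6-0ubuntu0.22.04.1.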
With that in place, we deploy a small ceph cluster. I used 3 mons and 3
OSDs, but any layout should work.
Once the cluster has been deployed we once again ssh into one of the
target machines (in this case, one of the mons).
As a precaution, we can test that ceph is running the proposed package:
$ ceph-mon -v
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
Now, we ensure that rook-mgr isn't installed or running:
$ sudo ceph mgr module ls
MODULE
balancer           on (always on)
crash              on (always on)
devicehealth       on (always on)
orchestrator       on (always on)
pg_autoscaler      on (always on)
progress           on (always on)
rbd_support        on (always on)
status             on (always on)
telemetry          on (always on)
volumes            on (always on)
iostat             on
nfs                on
restful            on
alerts             -
influx             -
insights           -
localpool          -
mirroring          -
osd_perf_query     -
osd_support        -
prometheus         -
selftest           -
snap_schedule      -
stats              -
telegraf           -
test_orchestrator  -
zabbix             -
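The same listing can be checked mechanically, both here (where rook should be absent) and again once the module has been enabled. This is a minimal sketch that assumes the two-column table layout printed above; module_state is a hypothetical helper name.

```shell
# Read `ceph mgr module ls` table output on stdin and print the state of
# one module: "on", "-" (available but disabled), or "absent" when the
# module does not appear in the listing at all.
module_state() {
    awk -v name="$1" '
        $1 == name { print $2; found = 1; exit }
        END { if (!found) print "absent" }
    '
}
```

For example, 'sudo ceph mgr module ls | module_state rook' should print "absent" before the package is installed and "on" after the module has been enabled.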
Then, we install the rook-mgr module by hand:
$ sudo apt install ceph-mgr-rook
Next, we enable the module:
$ sudo ceph mgr module enable rook
We then check that the ceph cluster is in a healthy state and no modules
have crashed by running:
ubuntu@juju-9f12b1-ceph-0:~$ sudo ceph -s
  cluster:
    id:     026e3f56-1f5e-11ee-bf28-95ba6942eafd
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum juju-9f12b1-ceph-1,juju-9f12b1-ceph-2,juju-9f12b1-ceph-0 (age 9m)
    mgr: juju-9f12b1-ceph-2(active, since 6m), standbys: juju-9f12b1-ceph-1, juju-9f12b1-ceph-0
    osd: 3 osds: 3 up, 3 in
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
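To turn the health check into a pass/fail step, the status can be extracted from that output. This is a sketch under the assumption that the 'health:' line looks like the one above; cluster_health is a hypothetical name.

```shell
# Read `ceph -s` output on stdin and print the health status
# (the second field of the "health:" line, e.g. HEALTH_OK).
cluster_health() {
    awk '$1 == "health:" { print $2; exit }'
}
```

Running 'sudo ceph -s | cluster_health' should print HEALTH_OK here. A crashed mgr module would normally be expected to show up as a health warning instead.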
And finally, we check that the rook module is up and running:
$ sudo ceph mgr module ls
MODULE
balancer           on (always on)
crash              on (always on)
devicehealth       on (always on)
orchestrator       on (always on)
pg_autoscaler      on (always on)
progress           on (always on)
rbd_support        on (always on)
status             on (always on)
telemetry          on (always on)
volumes            on (always on)
iostat             on
nfs                on
restful            on
rook               on
alerts             -
influx             -
insights           -
localpool          -
mirroring          -
osd_perf_query     -
osd_support        -
prometheus         -
selftest           -
snap_schedule      -
stats              -
telegraf           -
test_orchestrator  -
zabbix             -
The process can be repeated with Kinetic instead, with identical
results.
Again, I apologize for not providing a correct test plan before. Please
let me know if any further verification is needed.
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/2003530
Title:
Rook mgr module crashes due to missing mgr.nfs
Status in ceph package in Ubuntu:
Fix Released
Status in ceph source package in Jammy:
Fix Committed
Status in ceph source package in Kinetic:
Fix Committed
Status in ceph source package in Lunar:
Fix Released
Bug description:
[Impact]
The rook mgr service crashes on installing the ceph-mgr-rook package
(see the traceback below, taken from /var/log/syslog). This is due to a
missing ceph mgr package "nfs" which the rook mgr module depends upon.
This makes the rook mgr module unusable, and that module is required
for integrating Ceph with the Rook storage orchestrator.
The proposed patch fixes this by including the nfs mgr package in
the ceph-mgr-modules-core .deb. This is similar to how upstream
packages nfs for the ceph mgr system.
Jan 17 16:39:18 devcontainer-269785 bash[247610]: debug 2023-01-17T16:39:18.008+0000 7f930419fdc0 -1 mgr[py] Module not found: 'rook'
Jan 17 16:39:18 devcontainer-269785 bash[247610]: debug 2023-01-17T16:39:18.008+0000 7f930419fdc0 -1 mgr[py] Traceback (most recent call last):
Jan 17 16:39:18 devcontainer-269785 bash[247610]: File "/usr/share/ceph/mgr/rook/__init__.py", line 5, in <module>
Jan 17 16:39:18 devcontainer-269785 bash[247610]: from .module import RookOrchestrator
Jan 17 16:39:18 devcontainer-269785 bash[247610]: File "/usr/share/ceph/mgr/rook/module.py", line 41, in <module>
Jan 17 16:39:18 devcontainer-269785 bash[247610]: from .rook_cluster import RookCluster
Jan 17 16:39:18 devcontainer-269785 bash[247610]: File "/usr/share/ceph/mgr/rook/rook_cluster.py", line 29, in <module>
Jan 17 16:39:18 devcontainer-269785 bash[247610]: from nfs.cluster import create_ganesha_pool
Jan 17 16:39:18 devcontainer-269785 bash[247610]: ModuleNotFoundError: No module named 'nfs'
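The failing import can be anticipated by checking whether the nfs module directory is present next to rook. This is only a sketch: the default root is taken from the paths in the traceback, the assumption that each mgr module lives in a directory of the same name under that root is mine, and mgr_module_present is a hypothetical helper.

```shell
# Succeed if a directory for the named mgr module exists under the mgr
# module root. The default root comes from the traceback paths; pass a
# second argument to check a different location.
mgr_module_present() {
    module="$1"
    root="${2:-/usr/share/ceph/mgr}"
    [ -d "$root/$module" ]
}
```

On an affected system, 'mgr_module_present nfs' would be expected to fail, while 'mgr_module_present rook' succeeds once ceph-mgr-rook is installed.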
[Test plan]
The test requires a Ceph cluster. SSH to a system with a running
ceph-mon service.
$ sudo ceph mgr module ls # verify: no rook mgr module
$ sudo apt-get -q install ceph-mgr-rook
$ sudo ceph -s # verify: no crashed modules
$ sudo ceph mgr module ls # verify: rook mgr module present and enabled
[Where problems could occur]
The proposed patch only includes an additional Python package, and
regression potential should be low.
Issues could occur due to packaging bugs, such as missing dependencies
for the nfs mgr package. As the nfs package is currently missing,
there should not be any additional impact due to this.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/2003530/+subscriptions