[Bug 1986747] Re: [quincy] invalid osd_class_dir blocks rados client connections
Launchpad Bug Tracker
1986747 at bugs.launchpad.net
Thu Sep 1 15:29:06 UTC 2022
** Merge proposal linked:
https://code.launchpad.net/~chris.macnaughton/ubuntu/+source/ceph/+git/ceph/+merge/429304
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1986747
Title:
[quincy] invalid osd_class_dir blocks rados client connections
Status in Ubuntu Cloud Archive:
New
Status in Ubuntu Cloud Archive yoga series:
New
Status in ceph package in Ubuntu:
Confirmed
Status in ceph source package in Jammy:
Confirmed
Status in ceph source package in Kinetic:
Confirmed
Bug description:
[Impact]
A Ceph cluster that was upgraded from a release earlier than 19.04 will
break when it is upgraded to Ceph Quincy (Jammy and Kinetic).
The Ubuntu packaging configures `osd_class_dir` with the relative path
`CMAKE_INSTALL_LIBDIR` instead of the required absolute path
`CMAKE_INSTALL_FULL_LIBDIR` [0].
The default value for `osd_class_dir` changed in Quincy, starting with
v17.1.0 [1].
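As an illustration of why a relative value fails, the sketch below reproduces the lookup in a throwaway directory. It assumes nothing about Ceph itself: the temporary `prefix` directory stands in for `/usr`, the empty `libcls_rbd.so` file is a placeholder, and the helper name `lookup` is hypothetical. It only shows that a relative path is resolved against the working directory of the process that opens it.

```shell
#!/bin/sh
# Illustration only: "prefix" stands in for /usr, and the .so file is an
# empty placeholder, not a real RADOS class library.
prefix=$(mktemp -d)
workdir=$(mktemp -d)
rel_dir="lib/x86_64-linux-gnu/rados-classes"
mkdir -p "$prefix/$rel_dir"
: > "$prefix/$rel_dir/libcls_rbd.so"

# lookup PATH: report whether PATH exists from the current directory.
lookup() {
    if [ -e "$1" ]; then echo found; else echo missing; fi
}

# The relative path is resolved against the working directory, so from an
# unrelated directory the file cannot be found.
relative_result=$(cd "$workdir" && lookup "$rel_dir/libcls_rbd.so")
# The absolute path resolves the same way from anywhere.
absolute_result=$(lookup "$prefix/$rel_dir/libcls_rbd.so")

echo "relative: $relative_result"
echo "absolute: $absolute_result"

rm -rf "$prefix" "$workdir"
```

The relative lookup reports `missing` and the absolute one reports `found`, which matches the "No such file or directory" symptom described in this bug.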
The ceph-osd service relies on the `osd_class_dir` path to find and load class libraries that extend RADOS [2]. When this is set incorrectly, RADOS clients fail with repeated "Operation not supported" errors:
```
2022-08-16T17:42:15.044+0000 7fe375685e40 0 rgw main: ERROR: failed reading data (obj=default.rgw.log:bucket.sync-target-hints.), r=-95
2022-08-16T17:42:15.048+0000 7fe375685e40 0 rgw main: ERROR: failed to read targets index for bucket=:[]) r=-95
2022-08-16T17:42:15.048+0000 7fe375685e40 0 rgw main: ERROR: failed to initialize bucket sync policy handler: get_bucket_sync_hints() on bucket=-- returned r=-95
2022-08-16T17:42:15.048+0000 7fe375685e40 -1 rgw main: ERROR: could not initialize zone policy handler for zone=default
2022-08-16T17:42:15.048+0000 7fe375685e40 0 rgw main: ERROR: failed to start notify service ((95) Operation not supported
2022-08-16T17:42:15.048+0000 7fe375685e40 0 rgw main: ERROR: failed to init services (ret=(95) Operation not supported)
```
The ceph-osd service will also report `_load_class` errors:
```
2022-08-16T19:05:55.562+0000 7f4770ff9700 0 _load_class could not stat class lib/x86_64-linux-gnu/rados-classes/libcls_rbd.so: (2) No such file or directory
```
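A small filter can help spot this in OSD logs. The sketch below is hypothetical (the function name `relative_class_paths` is not part of Ceph) and assumes the log lines have the same shape as the `_load_class` message above. It reads log text on stdin and prints each class path that does not start with `/`.

```shell
#!/bin/sh
# relative_class_paths: read OSD log text on stdin and print every class
# path from a "_load_class could not stat class" line that is relative.
# Assumes the message format shown in this bug report.
relative_class_paths() {
    sed -n 's/.*_load_class could not stat class \([^:]*\):.*/\1/p' |
        grep -v '^/' |
        sort -u
}
```

For example, `relative_class_paths < /var/log/ceph/ceph-osd.0.log`, where the log file name is an assumption about the local setup. Any output indicates that the OSD is resolving class libraries through a relative `osd_class_dir`.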
Admins can resolve this issue by manually setting `osd_class_dir` to the correct value. Run the following command on a ceph-mon:
```
sudo ceph config set global osd_class_dir /usr/lib/x86_64-linux-gnu/rados-classes
```
Then restart all ceph-osd services to pick up the new `osd_class_dir`
location.
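The workaround can be checked with a short script. This is a sketch under stated assumptions, not part of the proposed fix: the function names are hypothetical, and `verify_osd_class_dir` assumes it runs on a node with admin access to the cluster, where `ceph config get` can read the stored value.

```shell
#!/bin/sh
# class_dir_looks_valid DIR: succeed only if DIR is an absolute path.
class_dir_looks_valid() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# verify_osd_class_dir: read the stored osd_class_dir and report whether
# it is absolute. Defined here but not called; it needs a working ceph CLI.
verify_osd_class_dir() {
    value=$(ceph config get osd osd_class_dir) || return 1
    if class_dir_looks_valid "$value"; then
        echo "osd_class_dir is absolute: $value"
    else
        echo "osd_class_dir is still relative: $value" >&2
        return 1
    fi
}
```

After `verify_osd_class_dir` succeeds, the OSDs on each host could be restarted with something like `sudo systemctl restart ceph-osd.target`, assuming the OSDs are managed by systemd under that target.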
[0] https://cmake.org/cmake/help/v3.24/module/GNUInstallDirs.html#result-variables
[1] https://github.com/ceph/ceph/commit/3bee4b02611459b9ae949cebf5967e4d83ef55de
[2] https://docs.ceph.com/en/latest/dev/osd-class-path/
[Test Plan]
1. Install Ceph on Bionic
2. Upgrade through to Jammy
   a. Confirm that client usage is broken
3. Upgrade to Jammy-proposed
   a. Confirm that client usage works again
In addition to checking client activity, confirm that the OSD logs
contain no errors about failing to load classes.
[Where problems could occur]
Problems could occur as a result of library paths changing, so Ceph
functionality should be verified. This will be done with functional
tests of Ceph using the Ceph Juju charms.
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1986747/+subscriptions