[Bug 1986747] Re: [quincy] invalid osd_class_dir blocks rados client connections
Chris MacNaughton
1986747 at bugs.launchpad.net
Tue Nov 22 16:08:48 UTC 2022
This bug was fixed in the package ceph - 17.2.0-0ubuntu0.22.04.2~cloud0
---------------
ceph (17.2.0-0ubuntu0.22.04.2~cloud0) focal-yoga; urgency=medium
.
* New update for the Ubuntu Cloud Archive.
.
ceph (17.2.0-0ubuntu0.22.04.2) jammy; urgency=medium
.
* d/p/lp1986747-fix-osd-class-dir.patch: Partially revert upstream
change that breaks classpath loading (LP: #1986747).
** Changed in: cloud-archive
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1986747
Title:
[quincy] invalid osd_class_dir blocks rados client connections
Status in Ubuntu Cloud Archive:
Fix Released
Status in Ubuntu Cloud Archive yoga series:
Fix Released
Status in ceph package in Ubuntu:
Fix Released
Status in ceph source package in Jammy:
Fix Released
Status in ceph source package in Kinetic:
Fix Released
Bug description:
[Impact]
A Ceph cluster that was upgraded from a release earlier than 19.04 will
break upon upgrade to Ceph Quincy (Jammy and Kinetic).
The Ubuntu packaging configures `osd_class_dir` with the relative path
`CMAKE_INSTALL_LIBDIR` instead of the required absolute path
`CMAKE_INSTALL_FULL_LIBDIR` [0].
The default value for `osd_class_dir` changed in Quincy, starting with
v17.1.0 [1].
The ceph-osd service relies on the `osd_class_dir` path to find and load class libraries that extend RADOS [2]. When this is set incorrectly, RADOS clients fail with repeated "Operation not supported" errors:
```
2022-08-16T17:42:15.044+0000 7fe375685e40 0 rgw main: ERROR: failed reading data (obj=default.rgw.log:bucket.sync-target-hints.), r=-95
2022-08-16T17:42:15.048+0000 7fe375685e40 0 rgw main: ERROR: failed to read targets index for bucket=:[]) r=-95
2022-08-16T17:42:15.048+0000 7fe375685e40 0 rgw main: ERROR: failed to initialize bucket sync policy handler: get_bucket_sync_hints() on bucket=-- returned r=-95
2022-08-16T17:42:15.048+0000 7fe375685e40 -1 rgw main: ERROR: could not initialize zone policy handler for zone=default
2022-08-16T17:42:15.048+0000 7fe375685e40 0 rgw main: ERROR: failed to start notify service ((95) Operation not supported
2022-08-16T17:42:15.048+0000 7fe375685e40 0 rgw main: ERROR: failed to init services (ret=(95) Operation not supported)
```
The ceph-osd service will also report `_load_class` errors:
```
2022-08-16T19:05:55.562+0000 7f4770ff9700 0 _load_class could not stat class lib/x86_64-linux-gnu/rados-classes/libcls_rbd.so: (2) No such file or directory
```
Admins can resolve this issue by manually setting `osd_class_dir` to the correct value. Run the following command on a ceph-mon:
```
sudo ceph config set global osd_class_dir /usr/lib/x86_64-linux-gnu/rados-classes
```
Then restart all ceph-osd services to pick up the new `osd_class_dir`
location.
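Before restarting, it may help to sanity-check the value that was set. The following is a minimal sketch and not part of the original fix: `check_class_dir` is a hypothetical helper that only tests whether a candidate path is absolute and contains at least one `libcls_*.so` file. The commented commands at the end assume a systemd-managed deployment and a reachable cluster.

```shell
# Hypothetical helper: returns 0 if the given osd_class_dir value is an
# absolute path to a directory holding at least one libcls_*.so library.
check_class_dir() {
    dir="$1"
    # A relative value such as lib/x86_64-linux-gnu/rados-classes is the
    # failure mode described in this bug, so reject anything not starting with /.
    case "$dir" in
        /*) ;;
        *) echo "not absolute: $dir" >&2; return 1 ;;
    esac
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    # Succeed as soon as one class library is found.
    for lib in "$dir"/libcls_*.so; do
        if [ -e "$lib" ]; then
            return 0
        fi
    done
    echo "no class libraries in: $dir" >&2
    return 1
}

# Example use on an OSD host (commented out; assumes a running cluster
# and systemd-managed OSDs):
# check_class_dir "$(sudo ceph config get osd osd_class_dir)"
# sudo systemctl restart ceph-osd.target
```

The relative path from the `_load_class` error above fails the first check, while `/usr/lib/x86_64-linux-gnu/rados-classes` should pass on a host where the class libraries are installed.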
[0] https://cmake.org/cmake/help/v3.24/module/GNUInstallDirs.html#result-variables
[1] https://github.com/ceph/ceph/commit/3bee4b02611459b9ae949cebf5967e4d83ef55de
[2] https://docs.ceph.com/en/latest/dev/osd-class-path/
[Test Plan]
1. Install Ceph on Bionic
2. Upgrade through to Jammy
   a. Confirm that client usage is broken
3. Upgrade to Jammy-proposed
   a. Confirm that client usage works again
In addition to client activity, it can be confirmed that the OSDs
don't have error logs about failing to load classes.
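One way to check the OSD logs for class-loading failures is sketched below. `count_load_class_errors` is a hypothetical helper, and the log path in the usage comment is an assumption based on the default packaged log location; adjust it for the deployment under test.

```shell
# Hypothetical helper: print how many "_load_class could not stat class"
# lines appear in the given log file.
count_load_class_errors() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c '_load_class could not stat class' "$1" || true
}

# Example use (commented out; log path is an assumption):
# count_load_class_errors /var/log/ceph/ceph-osd.0.log
```

A count of 0 in logs written after the upgrade to Jammy-proposed is consistent with the fix having taken effect.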
[Where problems could occur]
Problems could occur as a result of library paths changing, so Ceph
functionality should be verified. This will be done with functional
tests of Ceph using the Ceph Juju charms.
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1986747/+subscriptions