[Bug 2093880] Re: libcephfs: flush the caps release in filesystem sync
Kinson Chan
2093880 at bugs.launchpad.net
Mon Jan 13 16:03:12 UTC 2025
Thank you for the explanation. I see that flushing caps can take place
at any time, such as during a filesystem sync.
As explained in the linked articles, the caps obtained by
`ceph_try_get_caps` can get lost. To my limited knowledge, the backend
functions may acquire some caps even though the call eventually fails
with an error. In that situation, `got` is non-zero while `ret` is
negative.
The fix in Linux kernel 6.12 is that, when the code reaches the `out`
label, the value of `got` is examined and the caps are put back if
necessary. It is a change of only 3 or 4 lines, although I understand
it will need some time for QA.
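To illustrate the pattern I mean, here is a minimal userspace sketch. It
is not the kernel patch itself: `demo_inode`, `demo_try_get_caps`,
`demo_put_caps` and `demo_get_caps` are hypothetical stand-ins I made up,
and the failure is simulated with a flag. It only shows the idea of
checking the locally held caps at the `out` label and putting them back
when the call ends in an error.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for an inode's cap reference accounting. */
struct demo_inode {
    int cap_refs;       /* cap references currently held by callers */
    int fail_after_get; /* simulate an error raised after caps were taken */
};

/*
 * Hypothetical helper: takes a cap reference and reports it in *got,
 * yet may still return a negative error code afterwards.
 */
static int demo_try_get_caps(struct demo_inode *in, int want, int *got)
{
    in->cap_refs++;
    *got = want;
    return in->fail_after_get ? -EAGAIN : 0;
}

/* Hypothetical helper: drops one cap reference. */
static void demo_put_caps(struct demo_inode *in, int got)
{
    (void)got;
    assert(in->cap_refs > 0);
    in->cap_refs--;
}

/*
 * Caller with a single exit path. Caps are handed to the caller only on
 * success; anything still held locally at `out` is put back.
 */
static int demo_get_caps(struct demo_inode *in, int want, int *got)
{
    int held = 0; /* caps taken but not yet handed to the caller */
    int ret;

    *got = 0;
    ret = demo_try_get_caps(in, want, &held);
    if (ret < 0)
        goto out;

    *got = held; /* success: ownership moves to the caller */
    held = 0;

out:
    if (held)
        demo_put_caps(in, held); /* error path: do not leak the reference */
    return ret;
}
```

With `fail_after_get` set, `demo_get_caps` returns a negative value and
`cap_refs` goes back to 0. Without the check at `out`, the reference
would stay at 1 with nobody left to release it, which is the kind of
leak I understand leads to the revoke never being answered.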
So my question is whether the fix will be backported to the kernel /
libcephfs of Ubuntu 22.04 LTS. Thanks in advance for your help.
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/2093880
Title:
libcephfs: flush the caps release in filesystem sync
Status in ceph package in Ubuntu:
New
Bug description:
Hi,
The bug is as mentioned in the Ceph upstream: https://tracker.ceph.com/issues/67221
and also Linux upstream: https://github.com/torvalds/linux/commit/ccda9910d8490f4fb067131598e4b2e986faa5a0
In some situations, libcephfs loses track of the capabilities it has
acquired, and the client is then evicted by the Ceph MDS. I would like
to know whether a backport will be available for the Ubuntu 22.04 LTS
(Jammy) Ceph Quincy client (17.2.x).
Logs on the client:
Dec 19 04:32:45 *** kernel: libceph: mds0 (1)***:6801 socket closed (con state OPEN)
Dec 19 04:32:46 *** kernel: libceph: mds0 (1)***:6801 session reset
Dec 19 04:32:46 *** kernel: ceph: mds0 closed our session
Dec 19 04:32:46 *** kernel: ceph: mds0 reconnect start
Dec 19 04:32:46 *** kernel: ceph: mds0 reconnect denied
Dec 19 04:32:46 *** kernel: libceph: mds0 (1)***:6801 socket closed (con state V1_CONNECT_MSG)
Dec 19 04:32:47 *** kernel: ceph: mds0 rejected session
Logs on the server:
Dec 19 04:30:17 *** ceph-mds[1408372]: log_channel(cluster) log [WRN] : client.911386 isn't responding to mclientcaps(revoke), ino 0x1004e6bede5 pending pAsLsXsFs issued pAsLsXsFs, sent 240.313055 seconds ago
Dec 19 04:32:44 *** ceph-mds[1408372]: log_channel(cluster) log [INF] : Evicting (and blocklisting) client session 911386 (v1:***:0/362122962)
The Ubuntu client version:
Description: Ubuntu 22.04.5 LTS
Release: 22.04
Package:
libcephfs2/jammy-updates,jammy-security,now 17.2.7-0ubuntu0.22.04.2 amd64 [installed,automatic]
What was expected to happen:
* There should be no 'client isn't responding to ... pending pAsLsXsFs ...' messages, and no eviction.
What happened instead:
* The warning appeared and the client was evicted.
Thanks,
Kinson
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/2093880/+subscriptions