[Bug 1900690] Re: [Ubuntu 20.04] ceph: messages, mds: Fix decoding of enum types on big-endian systems
Frank Heimes
1900690 at bugs.launchpad.net
Thu Feb 4 18:07:15 UTC 2021
** Changed in: ubuntu-z-systems
Status: New => Triaged
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is a bug assignee.
https://bugs.launchpad.net/bugs/1900690
Title:
[Ubuntu 20.04] ceph: messages,mds: Fix decoding of enum types on big-endian systems
Status in Ubuntu on IBM z Systems:
Triaged
Status in ceph package in Ubuntu:
Triaged
Status in ceph source package in Focal:
Triaged
Status in ceph source package in Groovy:
Triaged
Bug description:
How to reproduce:
On initial installation, the Z cluster had 1 monitor node, 3 OSDs, 1 MDS and 1 MGR. In order to form a quorum, 2 nodes that were already OSDs were added as monitor nodes.
The Z cluster then had 3 monitor nodes, 2 of which were both OSDs and monitors.
However, at some point during the stress-ng run, the monitor daemon
crashed repeatedly, back to back. The crashes stopped only after both
monitor nodes that were also OSDs were removed from the quorum; the
cluster then remained stable.
Topology:
root@m8330013:~# ceph node ls all
{
    "mon": {
        "m8330013": [
            "m8330013"
        ],
        "m8330014": [
            "m8330014"
        ],
        "m8330015": [
            "m8330015"
        ]
    },
    "osd": {
        "m8330014": [
            0
        ],
        "m8330015": [
            1
        ],
        "m8330016": [
            2
        ]
    },
    "mds": {
        "m8330013": [
            "m8330013"
        ]
    },
    "mgr": {
        "m8330013": [
            "m8330013"
        ],
        "m8330015": [
            "m8330015"
        ]
    }
}
root@m8330013:~#
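The overlap between monitor and OSD hosts can be read off the topology programmatically. The following is a minimal Python sketch, assuming the `ceph node ls all` output above has been captured as a string; the helper name `colocated_mon_osd_hosts` is hypothetical and not part of Ceph.

```python
import json

# Output of `ceph node ls all` from the report, compacted.
NODE_LS = """
{
  "mon": {"m8330013": ["m8330013"], "m8330014": ["m8330014"], "m8330015": ["m8330015"]},
  "osd": {"m8330014": [0], "m8330015": [1], "m8330016": [2]},
  "mds": {"m8330013": ["m8330013"]},
  "mgr": {"m8330013": ["m8330013"], "m8330015": ["m8330015"]}
}
"""


def colocated_mon_osd_hosts(node_ls_json):
    """Return the sorted hosts listed under both "mon" and "osd" (hypothetical helper)."""
    nodes = json.loads(node_ls_json)
    return sorted(set(nodes.get("mon", {})) & set(nodes.get("osd", {})))


print(colocated_mon_osd_hosts(NODE_LS))  # ['m8330014', 'm8330015']
```

The two hosts printed are the ones that act as both monitor and OSD in this topology.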
The job file below runs each filesystem stressor sequentially, one
instance per CPU, for 5 minutes, and then shows the cumulative user and
system time of all processes at the end of the stress run.
stress-ng job file:
run sequential
metrics
verbose
timeout 5m
times
timestamp
#0 means 1 stressor per CPU
access 0
bind-mount 0
chdir 0
chmod 0
chown 0
copy-file 0
dentry 0
dir 0
dirdeep 0
dnotify 0
dup 0
eventfd 0
fallocate 0
fanotify 0
fcntl 0
fiemap 0
file-ioctl 0
filename 0
flock 0
fstat 0
getdent 0
handle 0
inode-flags 0
inotify 0
io 0
iomix 0
ioprio 0
lease 0
link 0
locka 0
lockf 0
lockofd 0
mknod 0
open 0
procfs 0
rename 0
symlink 0
sync-file 0
utime 0
xattr 0
Command for execution:
stress-ng --job <job_file> --temp-path <cephfs_mountpoint> --log-file <log_file>
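For scripted runs, the invocation can be assembled programmatically. This is only a sketch: it uses just the three options shown in the report, and the file names and mount point below are hypothetical placeholders.

```python
import shlex


def stress_ng_command(job_file, cephfs_mountpoint, log_file):
    """Build the stress-ng argument list using only the options from the report."""
    return [
        "stress-ng",
        "--job", job_file,
        "--temp-path", cephfs_mountpoint,
        "--log-file", log_file,
    ]


# Hypothetical paths, for illustration only.
argv = stress_ng_command("fs.job", "/mnt/cephfs", "stress-ng.log")
print(shlex.join(argv))
```

On a host with CephFS mounted, the resulting list could be passed to `subprocess.run(argv, check=True)`.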
A proposed fix was sent upstream:
https://github.com/ceph/ceph/pull/36697
As mentioned above, the fix for this issue landed upstream in PR:
https://github.com/ceph/ceph/pull/36697
which was backported to the Octopus (15.2.x) release in PR:
https://github.com/ceph/ceph/pull/36813
This backported patch seems to apply cleanly to ceph-15.2.3 in the
focal-updates git tree at:
https://git.launchpad.net/ubuntu/+source/ceph/log/?h=applied/ubuntu/focal-updates
Please apply the backported patch to this tree. Thanks.
Please be aware that upstream's backport PR
https://github.com/ceph/ceph/pull/36813 combines 2 patches from the
master branch:
https://github.com/ceph/ceph/pull/35920
https://github.com/ceph/ceph/pull/36697
Both are needed.
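The report does not spell out the failure mode, so the following is only a generic illustration of how an enum decode can become endian-sensitive, not a description of the upstream patches. It assumes a hypothetical enum value carried in one byte that is decoded through a wider 4-byte integer view: the byte lands at the lowest address of the slot, which is the low-order byte on little-endian hosts but the high-order byte on big-endian hosts such as s390x.

```python
import struct


def decode_through_wide_view(wire_byte, byteorder):
    """Simulate storing a 1-byte value at the start of a 4-byte integer slot.

    Hypothetical helper: it models the memory layout only, not Ceph's decoder.
    """
    slot = bytearray(4)
    slot[0] = wire_byte  # the 1-byte payload lands at the lowest address
    fmt = "<I" if byteorder == "little" else ">I"
    return struct.unpack(fmt, bytes(slot))[0]


print(decode_through_wide_view(2, "little"))  # 2
print(decode_through_wide_view(2, "big"))     # 33554432 (0x02000000)
```

The same bytes yield the intended value on a little-endian host and an unrelated one on a big-endian host, which is why such code can pass testing on x86 and still fail on IBM Z.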
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1900690/+subscriptions