[Bug 1979885] Re: /etc.nfs.conf fails for nfsv4 server / blkmapd dumps core
Launchpad Bug Tracker
1979885 at bugs.launchpad.net
Thu Nov 3 00:46:19 UTC 2022
This bug was fixed in the package nfs-utils - 1:2.6.1-2ubuntu5
---------------
nfs-utils (1:2.6.1-2ubuntu5) lunar; urgency=medium
* d/p/blkmapd-fix-invalid-free.patch: fix blkmapd crash due to invalid
free() (LP: #1979885)
-- Andreas Hasenack <andreas at canonical.com> Fri, 28 Oct 2022 08:26:52 -0300
** Changed in: nfs-utils (Ubuntu)
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to nfs-utils in Ubuntu.
https://bugs.launchpad.net/bugs/1979885
Title:
/etc.nfs.conf fails for nfsv4 server / blkmapd dumps core
Status in nfs-utils package in Ubuntu:
Fix Released
Status in nfs-utils source package in Jammy:
In Progress
Status in nfs-utils source package in Kinetic:
In Progress
Status in nfs-utils package in Debian:
Confirmed
Bug description:
[ Impact ]
Under certain conditions, blkmapd can crash due to calling free() on a
pointer that wasn't malloc()ed. The reproducer went as far as
isolating it to having LVM Logical Volumes on SCSI disks, but the code
flaw is clear.
The struct bl_serial *serial structure is allocated via
bl_create_scsi_string() which does a malloc for it, but the code later
on was doing a free() on the data element of this structure and only
then on the structure itself. That first free() is incorrect, as the
data element was never malloc()ed separately.
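As an illustration of the flaw described above, here is a minimal sketch. It is not the nfs-utils source: the struct layout and the names serial_sketch, serial_create() and serial_destroy() are hypothetical, and it assumes the data element points into the same allocation as the structure, which is one way a member can end up "never malloc()ed separately".

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct bl_serial; the field layout is assumed. */
struct serial_sketch {
	size_t len;
	char *data;	/* points into the same allocation as the struct */
};

/* A single malloc() covers both the header and the payload bytes. */
static struct serial_sketch *serial_create(const char *bytes, size_t len)
{
	struct serial_sketch *s = malloc(sizeof(*s) + len);

	if (s == NULL)
		return NULL;
	s->data = (char *)(s + 1);	/* interior pointer, not a malloc() result */
	s->len = len;
	memcpy(s->data, bytes, len);
	return s;
}

/* Correct cleanup: exactly one free(), on the pointer malloc() returned. */
static void serial_destroy(struct serial_sketch *s)
{
	/*
	 * Calling free(s->data) here first would be the invalid free():
	 * s->data was never returned by malloc(), so the allocator may
	 * detect the corruption and abort the process.
	 */
	free(s);
}
```

Under these assumptions, a caller pairs each serial_create() with a single serial_destroy() and never frees the data member on its own.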
This was first brought up by lixiaokeng via
https://www.spinics.net/lists/linux-nfs/msg87598.html, but not
acknowledged back then. The patch selected for this SRU is slightly
simpler and more suited for an SRU.
[ Test Plan ]
Create a VM for the ubuntu release under test. What's important is
that this VM has a SCSI device, not VIRTIO. You can add one after the
VM is created, as it must not be the root disk because we will use it
as an LVM volume group, i.e., all data on it will be erased.
You may have to install the kernel extra modules package for the scsi
device to appear:
sudo apt install linux-modules-extra-$(uname -r)
After a reboot, locate the scsi device. In this example, we will use
/dev/sda.
Wipe any existing partition table on it:
sudo sgdisk -Z /dev/sda
Create an LVM volume group and logical volume:
sudo pvcreate /dev/sda
sudo vgcreate vg0 /dev/sda
sudo lvcreate -ntest -L100M vg0
Install nfs-kernel-server:
sudo apt install nfs-kernel-server
The status of the nfs-blkmap service should already show a failure:
systemctl status nfs-blkmap.service
...
Oct 20 18:12:12 j-blkmapd-crash systemd[1]: nfs-blkmap.service: Main process exited, code=dumped, status=6/ABRT
Oct 20 18:12:12 j-blkmapd-crash systemd[1]: nfs-blkmap.service: Failed with result 'core-dump'.
To confirm, run it interactively:
$ sudo blkmapd -f
blkmapd: open pipe file /run/rpc_pipefs/nfs/blocklayout failed: No such file or directory
double free or corruption (out)
Aborted
With the fixed packages, it should be running after install. It can
also be tried out interactively again just to be sure:
sudo systemctl stop nfs-blkmap
sudo blkmapd -f
blkmapd: open pipe file /run/rpc_pipefs/nfs/blocklayout failed: No such file or directory
The failure to open the blocklayout file is not a problem in this
case, and is unrelated to the bug this SRU is fixing.
[ Where problems could occur ]
Restarting an NFS server can be tricky: connected clients might experience a "blip" in the service, or even hang in the worst case. Also, depending on the NFS version being served (3 or 4), multiple services are involved, and the restart can expose a bug in the order in which these services are stopped and brought back online.
In terms of the patch and code, it's C code dealing with pointers and
memory allocation. Things can easily go wrong here, and since this is
a daemon, memory leaks can have bigger consequences.
[ Other Info ]
I didn't continue the investigation about other scenarios where this could be happening, or why it did not happen with a VIRTIO device, as the SCSI case was enough to reproduce the problem and show where the bug was.
The previous SRU for nfs-utils
(https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/1977745) was
stopped by phasing because it detected
(https://errors.ubuntu.com/?release=Ubuntu%2022.04&package=nfs-utils&period=week&version=1%3A2.6.1-1ubuntu1.1)
the crash from this bug here during the restart of blkmapd.
[Original Description]
When using the 22.04 /etc/nfs.conf, an nfsv4 server fails to operate.
It kind of works, but some clients fail and try the nfsv3 ports.
symptoms:
on boot:
× nfs-blkmap.service - pNFS block layout mapping daemon
Loaded: loaded (/lib/systemd/system/nfs-blkmap.service; enabled; vendor preset: enabled)
Active: failed (Result: core-dump) since Sat 2022-06-25 07:14:34 PDT; 27min ago
journalctl --catalog --pager-end --unit=nfs-blkmap.service
Jun 25 07:14:34 c68z blkmapd[2386154]: open pipe file /run/rpc_pipefs/nfs/blocklayout failed: No such file or directory
on systemctl restart nfs-server.service:
○ rpc-svcgssd.service - RPC security service for NFS server
Loaded: loaded (/lib/systemd/system/rpc-svcgssd.service; static)
Active: inactive (dead) since Fri 2022-06-24 19:07:31 PDT; 12h ago
after boot it was:
● rpc-svcgssd.service - RPC security service for NFS server
Loaded: loaded (/lib/systemd/system/rpc-svcgssd.service; static)
Active: active (running) since Sat 2022-06-25 08:27:27 PDT; 2min 7s ago
Some clients try to access port 111, which is not used by nfs4 on the
network
ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: linux-image-5.15.0-40-generic 5.15.0-40.43
ProcVersionSignature: Ubuntu 5.15.0-40.43-generic 5.15.35
Uname: Linux 5.15.0-40-generic x86_64
NonfreeKernelModules: zfs zunicode zcommon znvpair zavl icp
ApportVersion: 2.20.11-0ubuntu82.1
Architecture: amd64
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/controlC0', '/dev/snd/hwC0D2', '/dev/snd/pcmC0D10p', '/dev/snd/pcmC0D9p', '/dev/snd/pcmC0D8p', '/dev/snd/pcmC0D7p', '/dev/snd/pcmC0D3p', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: N/A
CasperMD5CheckResult: unknown
Date: Sat Jun 25 08:37:48 2022
HibernationDevice: RESUME=none
MachineType: Apple Inc. Macmini8,1
ProcEnviron:
SHELL=/bin/bash
LANG=en_US.UTF-8
TERM=screen
PATH=(custom, no user)
ProcFB: 0 i915drmfb
ProcKernelCmdLine: root=ZFS=rpool/ROOT/ubuntu_mc4at7 ro initrd=EFI\hostname\initrd.img
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
linux-restricted-modules-5.15.0-40-generic N/A
linux-backports-modules-5.15.0-40-generic N/A
linux-firmware 20220329.git681281e4-0ubuntu3.2
RfKill:
0: hci0: Bluetooth
Soft blocked: no
Hard blocked: no
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/24/2022
dmi.bios.release: 0.1
dmi.bios.vendor: Apple Inc.
dmi.bios.version: 1731.120.10.0.0 (iBridge: 19.16.15071.0.0,0)
dmi.board.name: Mac-7BA5B2DFE22DDD8C
dmi.board.vendor: Apple Inc.
dmi.board.version: Macmini8,1
dmi.chassis.type: 9
dmi.chassis.vendor: Apple Inc.
dmi.chassis.version: Mac-7BA5B2DFE22DDD8C
dmi.modalias: dmi:bvnAppleInc.:bvr1731.120.10.0.0(iBridge19.16.15071.0.0,0):bd04/24/2022:br0.1:svnAppleInc.:pnMacmini8,1:pvr1.0:rvnAppleInc.:rnMac-7BA5B2DFE22DDD8C:rvrMacmini8,1:cvnAppleInc.:ct9:cvrMac-7BA5B2DFE22DDD8C:sku:
dmi.product.family: Mac mini
dmi.product.name: Macmini8,1
dmi.product.version: 1.0
dmi.sys.vendor: Apple Inc.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/1979885/+subscriptions