[Bug 1315955] Re: nfsd hangs
F0x06
kevin.velickovic at gmail.com
Mon Jun 8 08:00:54 UTC 2015
Same problem for me
Ubuntu version: Ubuntu 14.04.2 LTS
Kernel version: 3.13.0-53-generic #89-Ubuntu SMP Wed May 20 10:34:39 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Jun 8 17:28:41 server1 kernel: [ 1561.791349] INFO: task nfsd:1986 blocked for more than 120 seconds.
Jun 8 17:28:41 server1 kernel: [ 1561.791366] Tainted: P OX 3.13.0-53-generic #89-Ubuntu
Jun 8 17:28:41 server1 kernel: [ 1561.791384] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 8 17:28:41 server1 kernel: [ 1561.791405] nfsd D ffff88041fb93180 0 1986 2 0x00000000
Jun 8 17:28:41 server1 kernel: [ 1561.791407] ffff8803f94d18a0 0000000000000046 ffff8800d4588000 ffff8803f94d1fd8
Jun 8 17:28:41 server1 kernel: [ 1561.791409] 0000000000013180 0000000000013180 ffff8800d4588000 ffff880256c1e8f8
Jun 8 17:28:41 server1 kernel: [ 1561.791410] ffff880256c1e8a8 ffff880256c1e900 ffff880256c1e8d0 0000000000000000
Jun 8 17:28:41 server1 kernel: [ 1561.791412] Call Trace:
Jun 8 17:28:41 server1 kernel: [ 1561.791414] [<ffffffff81727229>] schedule+0x29/0x70
Jun 8 17:28:41 server1 kernel: [ 1561.791419] [<ffffffffa007eaf5>] cv_wait_common+0xe5/0x120 [spl]
Jun 8 17:28:41 server1 kernel: [ 1561.791421] [<ffffffff810ab220>] ? prepare_to_wait_event+0x100/0x100
Jun 8 17:28:41 server1 kernel: [ 1561.791426] [<ffffffffa007eb45>] __cv_wait+0x15/0x20 [spl]
Jun 8 17:28:41 server1 kernel: [ 1561.791437] [<ffffffffa0139da3>] dmu_buf_hold_array_by_dnode+0x233/0x570 [zfs]
Jun 8 17:28:41 server1 kernel: [ 1561.791449] [<ffffffffa013a1bd>] dmu_buf_hold_array+0x5d/0x80 [zfs]
Jun 8 17:28:41 server1 kernel: [ 1561.791461] [<ffffffffa013ba01>] dmu_read_uio+0x41/0xe0 [zfs]
Jun 8 17:28:41 server1 kernel: [ 1561.791480] [<ffffffffa01bafbc>] zfs_read+0x14c/0x450 [zfs]
Jun 8 17:28:41 server1 kernel: [ 1561.791492] [<ffffffffa01385fe>] ? dmu_object_size_from_db+0x5e/0x80 [zfs]
Jun 8 17:28:41 server1 kernel: [ 1561.791511] [<ffffffffa01d757a>] zpl_aio_read+0xda/0x130 [zfs]
Jun 8 17:28:41 server1 kernel: [ 1561.791513] [<ffffffff811bd9cc>] do_sync_readv_writev+0x4c/0x80
Jun 8 17:28:41 server1 kernel: [ 1561.791515] [<ffffffff811bee90>] do_readv_writev+0xb0/0x220
Jun 8 17:28:41 server1 kernel: [ 1561.791534] [<ffffffffa01bac57>] ? zfs_open+0x87/0x120 [zfs]
Jun 8 17:28:41 server1 kernel: [ 1561.791536] [<ffffffff813213d3>] ? ima_get_action+0x23/0x30
Jun 8 17:28:41 server1 kernel: [ 1561.791538] [<ffffffff813206b2>] ? process_measurement+0x82/0x2c0
Jun 8 17:28:41 server1 kernel: [ 1561.791539] [<ffffffff811bf02d>] vfs_readv+0x2d/0x50
Jun 8 17:28:41 server1 kernel: [ 1561.791543] [<ffffffffa0401aae>] nfsd_vfs_read.isra.12+0x6e/0x160 [nfsd]
Jun 8 17:28:41 server1 kernel: [ 1561.791547] [<ffffffffa0402da9>] ? nfsd_open+0xb9/0x190 [nfsd]
Jun 8 17:28:41 server1 kernel: [ 1561.791551] [<ffffffffa0403076>] nfsd_read+0x1e6/0x2c0 [nfsd]
Jun 8 17:28:41 server1 kernel: [ 1561.791557] [<ffffffffa040ce4c>] nfsd3_proc_read+0xcc/0x170 [nfsd]
Jun 8 17:28:41 server1 kernel: [ 1561.791561] [<ffffffffa03fdd3b>] nfsd_dispatch+0xbb/0x200 [nfsd]
Jun 8 17:28:41 server1 kernel: [ 1561.791568] [<ffffffffa036262d>] svc_process_common+0x46d/0x6d0 [sunrpc]
Jun 8 17:28:41 server1 kernel: [ 1561.791575] [<ffffffffa0362997>] svc_process+0x107/0x170 [sunrpc]
Jun 8 17:28:41 server1 kernel: [ 1561.791578] [<ffffffffa03fd71f>] nfsd+0xbf/0x130 [nfsd]
Jun 8 17:28:41 server1 kernel: [ 1561.791582] [<ffffffffa03fd660>] ? nfsd_destroy+0x80/0x80 [nfsd]
Jun 8 17:28:41 server1 kernel: [ 1561.791583] [<ffffffff8108b6b2>] kthread+0xd2/0xf0
Jun 8 17:28:41 server1 kernel: [ 1561.791585] [<ffffffff8108b5e0>] ? kthread_create_on_node+0x1c0/0x1c0
Jun 8 17:28:41 server1 kernel: [ 1561.791586] [<ffffffff81733868>] ret_from_fork+0x58/0x90
Jun 8 17:28:41 server1 kernel: [ 1561.791587] [<ffffffff8108b5e0>] ? kthread_create_on_node+0x1c0/0x1c0
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to nfs-utils in Ubuntu.
https://bugs.launchpad.net/bugs/1315955
Title:
nfsd hangs
Status in NFS-Utils - NFS support files common to client and server:
New
Status in nfs-utils package in Ubuntu:
Incomplete
Status in nfs-utils source package in Trusty:
Incomplete
Bug description:
On a relatively busy NFS server, the system hung on us with the
following messages:
May 4 07:53:36 wol-nfs kernel: [487678.715589] INFO: task nfsd:2793 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.715653] Not tainted 3.13.0-24-generic #46-Ubuntu
May 4 07:53:36 wol-nfs kernel: [487678.715695] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 4 07:53:36 wol-nfs kernel: [487678.715790] nfsd D ffff88023fc14440 0 2793 2 0x00000000
May 4 07:53:36 wol-nfs kernel: [487678.715800] ffff88023317fca0 0000000000000002 ffff880233268000 ffff88023317ffd8
May 4 07:53:36 wol-nfs kernel: [487678.715807] 0000000000014440 0000000000014440 ffff880233268000 ffffffffa03520a0
May 4 07:53:36 wol-nfs kernel: [487678.715811] ffffffffa03520a4 ffff880233268000 00000000ffffffff ffffffffa03520a8
May 4 07:53:36 wol-nfs kernel: [487678.715818] Call Trace:
May 4 07:53:36 wol-nfs kernel: [487678.715860] [<ffffffff8171a3a9>] schedule_preempt_disabled+0x29/0x70
May 4 07:53:36 wol-nfs kernel: [487678.715865] [<ffffffff8171c215>] __mutex_lock_slowpath+0x135/0x1b0
May 4 07:53:36 wol-nfs kernel: [487678.715870] [<ffffffff8171c2af>] mutex_lock+0x1f/0x2f
May 4 07:53:36 wol-nfs kernel: [487678.715905] [<ffffffffa033be55>] nfs4_lock_state+0x15/0x20 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.715917] [<ffffffffa032e858>] nfsd4_open+0xd8/0x8f0 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.715928] [<ffffffffa032f5da>] nfsd4_proc_compound+0x56a/0x7b0 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.715937] [<ffffffffa031bd2b>] nfsd_dispatch+0xbb/0x200 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.715961] [<ffffffffa026a63d>] svc_process_common+0x46d/0x6d0 [sunrpc]
May 4 07:53:36 wol-nfs kernel: [487678.715977] [<ffffffffa026a9a7>] svc_process+0x107/0x170 [sunrpc]
May 4 07:53:36 wol-nfs kernel: [487678.715986] [<ffffffffa031b71f>] nfsd+0xbf/0x130 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.715995] [<ffffffffa031b660>] ? nfsd_destroy+0x80/0x80 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.716004] [<ffffffff8108b312>] kthread+0xd2/0xf0
May 4 07:53:36 wol-nfs kernel: [487678.716009] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
May 4 07:53:36 wol-nfs kernel: [487678.716016] [<ffffffff8172637c>] ret_from_fork+0x7c/0xb0
May 4 07:53:36 wol-nfs kernel: [487678.716020] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
And many more with the exact same stack trace:
May 4 07:53:36 wol-nfs kernel: [487678.716025] INFO: task nfsd:2794 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.716500] INFO: task nfsd:2795 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.717166] INFO: task nfsd:2796 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.717657] INFO: task nfsd:2797 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.718150] INFO: task nfsd:2798 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.718743] INFO: task nfsd:2799 blocked for more than 120 seconds.
Except this one:
May 4 07:53:36 wol-nfs kernel: [487678.719229] INFO: task nfsd:2800 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.719347] Not tainted 3.13.0-24-generic #46-Ubuntu
May 4 07:53:36 wol-nfs kernel: [487678.719605] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 4 07:53:36 wol-nfs kernel: [487678.719741] nfsd D ffff88023fd94440 0 2800 2 0x00000000
May 4 07:53:36 wol-nfs kernel: [487678.719746] ffff8800b81f1b40 0000000000000002 ffff88022f96c7d0 ffff8800b81f1fd8
May 4 07:53:36 wol-nfs kernel: [487678.719751] 0000000000014440 0000000000014440 ffff88022f96c7d0 ffff8800b81f1ca8
May 4 07:53:36 wol-nfs kernel: [487678.719755] ffff8800b81f1cb0 7fffffffffffffff ffff88022f96c7d0 ffff8800b81f1c90
May 4 07:53:36 wol-nfs kernel: [487678.719760] Call Trace:
May 4 07:53:36 wol-nfs kernel: [487678.719766] [<ffffffff81719e89>] schedule+0x29/0x70
May 4 07:53:36 wol-nfs kernel: [487678.719770] [<ffffffff817190d9>] schedule_timeout+0x239/0x2d0
May 4 07:53:36 wol-nfs kernel: [487678.719775] [<ffffffff81719a11>] ? __schedule+0x381/0x7d0
May 4 07:53:36 wol-nfs kernel: [487678.719781] [<ffffffff8101b763>] ? native_sched_clock+0x13/0x80
May 4 07:53:36 wol-nfs kernel: [487678.719786] [<ffffffff8101b7d9>] ? sched_clock+0x9/0x10
May 4 07:53:36 wol-nfs kernel: [487678.719791] [<ffffffff8171a9a6>] wait_for_completion+0xa6/0x160
May 4 07:53:36 wol-nfs kernel: [487678.719798] [<ffffffff8109a790>] ? wake_up_state+0x20/0x20
May 4 07:53:36 wol-nfs kernel: [487678.719804] [<ffffffff810824ca>] flush_workqueue+0x11a/0x5a0
May 4 07:53:36 wol-nfs kernel: [487678.719818] [<ffffffffa0346683>] nfsd4_shutdown_callback+0x73/0x80 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.719829] [<ffffffffa033d37d>] destroy_client+0x18d/0x430 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.719840] [<ffffffffa033e9d6>] nfsd4_setclientid_confirm+0x1e6/0x210 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.719849] [<ffffffffa032f5da>] nfsd4_proc_compound+0x56a/0x7b0 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.719857] [<ffffffffa031bd2b>] nfsd_dispatch+0xbb/0x200 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.719872] [<ffffffffa026a63d>] svc_process_common+0x46d/0x6d0 [sunrpc]
May 4 07:53:36 wol-nfs kernel: [487678.719885] [<ffffffffa026a9a7>] svc_process+0x107/0x170 [sunrpc]
May 4 07:53:36 wol-nfs kernel: [487678.719893] [<ffffffffa031b71f>] nfsd+0xbf/0x130 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.719901] [<ffffffffa031b660>] ? nfsd_destroy+0x80/0x80 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.719905] [<ffffffff8108b312>] kthread+0xd2/0xf0
May 4 07:53:36 wol-nfs kernel: [487678.719909] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
May 4 07:53:36 wol-nfs kernel: [487678.719914] [<ffffffff8172637c>] ret_from_fork+0x7c/0xb0
May 4 07:53:36 wol-nfs kernel: [487678.719918] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
It looks like the last thread just hung while holding a lock, blocking every other nfsd thread.
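A minimal sketch of one way to check this from kern.log: group the hung-task reports by call trace so that the odd one out stands out. The names `parse_hung_tasks`, `group_by_trace` and `SAMPLE` are made up for illustration, the sample lines are abridged from the traces above, and the regular expressions assume the 3.13-era log format shown in this report.

```python
import re
from collections import defaultdict

# Header printed by the kernel's hung-task detector.
TASK_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")
# One stack frame, e.g. "[<ffffffff8171c2af>] mutex_lock+0x1f/0x2f".
# Group 1 is set for frames marked "?", which the kernel considers unreliable.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def parse_hung_tasks(lines):
    """Return {(comm, pid): [function, ...]} for each hung-task report."""
    traces = {}
    current = None
    for line in lines:
        m = TASK_RE.search(line)
        if m:
            current = (m.group(1), int(m.group(2)))
            traces[current] = []
            continue
        m = FRAME_RE.search(line)
        # Keep only reliable frames that follow a hung-task header.
        if m and current is not None and not m.group(1):
            traces[current].append(m.group(2))
    return traces


def group_by_trace(traces):
    """Group PIDs by identical call trace (innermost frame first)."""
    groups = defaultdict(list)
    for (_comm, pid), frames in traces.items():
        groups[tuple(frames)].append(pid)
    return dict(groups)


# Abridged lines from the traces in this report, used as sample input.
SAMPLE = """\
May 4 07:53:36 wol-nfs kernel: [487678.715589] INFO: task nfsd:2793 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.715818] Call Trace:
May 4 07:53:36 wol-nfs kernel: [487678.715865] [<ffffffff8171c215>] __mutex_lock_slowpath+0x135/0x1b0
May 4 07:53:36 wol-nfs kernel: [487678.715870] [<ffffffff8171c2af>] mutex_lock+0x1f/0x2f
May 4 07:53:36 wol-nfs kernel: [487678.715905] [<ffffffffa033be55>] nfs4_lock_state+0x15/0x20 [nfsd]
May 4 07:53:36 wol-nfs kernel: [487678.719229] INFO: task nfsd:2800 blocked for more than 120 seconds.
May 4 07:53:36 wol-nfs kernel: [487678.719791] [<ffffffff8171a9a6>] wait_for_completion+0xa6/0x160
May 4 07:53:36 wol-nfs kernel: [487678.719798] [<ffffffff8109a790>] ? wake_up_state+0x20/0x20
May 4 07:53:36 wol-nfs kernel: [487678.719804] [<ffffffff810824ca>] flush_workqueue+0x11a/0x5a0
May 4 07:53:36 wol-nfs kernel: [487678.719818] [<ffffffffa0346683>] nfsd4_shutdown_callback+0x73/0x80 [nfsd]
"""

for frames, pids in group_by_trace(parse_hung_tasks(SAMPLE.splitlines())).items():
    print(pids, "->", " < ".join(frames))
```

On a full kern.log, the threads waiting in nfs4_lock_state should land in one large group and the thread waiting in flush_workqueue in a group of its own. The parser does not reset at other trace types (such as a soft lockup), so those frames would be appended to the preceding hung-task report.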
Preceding the crash, there were a few suspicious messages about a CPU soft lockup, with the following stack trace. This may or may not be related; it happened days earlier, so it's probably nothing.
Apr 30 12:45:41 wol-nfs kernel: [159283.910727] BUG: soft lockup - CPU#2 stuck for 22s! [chown:6108]
Apr 30 12:45:41 wol-nfs kernel: [159283.910928] Call Trace:
Apr 30 12:45:41 wol-nfs kernel: [159283.910934] [<ffffffff812085e0>] ? locks_delete_block+0x70/0x80
Apr 30 12:45:41 wol-nfs kernel: [159283.910937] [<ffffffff81209f40>] __break_lease+0x350/0x3d0
Apr 30 12:45:41 wol-nfs kernel: [159283.910940] [<ffffffff811d5b48>] ? notify_change+0x1a8/0x390
Apr 30 12:45:41 wol-nfs kernel: [159283.910943] [<ffffffff811b6767>] chown_common+0x117/0x180
Apr 30 12:45:41 wol-nfs kernel: [159283.910945] [<ffffffff811b826f>] SyS_fchownat+0xaf/0x110
Apr 30 12:45:41 wol-nfs kernel: [159283.910948] [<ffffffff8172663f>] tracesys+0xe1/0xe6
Apr 30 12:45:41 wol-nfs kernel: [159283.910949] Code: 39 d0 75 ea b8 01 00 00 00 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 e9 06 00 00 00 66 83 07 02 c3 90 8b 37 f0 66 83 07 02 <f6> 47 02 01 74 f1 55 48 89 e5 e8 31 1b ff ff 5d c3 0f 1f 84 00
The relevant sections of kern.log are in a separate attachment.
ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-generic 3.13.0.24.29
ProcVersionSignature: Ubuntu 3.13.0-24.46-generic 3.13.9
Uname: Linux 3.13.0-24-generic x86_64
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 May 4 23:41 seq
crw-rw---- 1 root audio 116, 33 May 4 23:41 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.14.1-0ubuntu3
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory: 'iw'
CurrentDmesg:
[ 5.274819] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[ 5.279871] NFSD: starting 90-second grace period (net ffffffff81cd9b00)
[ 5.518836] init: plymouth-upstart-bridge main process ended, respawning
[ 12.233348] [UFW BLOCK] IN=eth0 OUT= MAC=00:50:56:91:fc:20:00:00:00:00:00:00:08:00 SRC=10.0.0.0 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0x00 TTL=1 ID=0 PROTO=2
Date: Mon May 5 00:29:12 2014
HibernationDevice: RESUME=/dev/mapper/wolnfs--vg-swap_1
InstallationDate: Installed on 2014-04-20 (14 days ago)
InstallationMedia: Ubuntu-Server 14.04 LTS "Trusty Tahr" - Release amd64 (20140416.2)
IwConfig:
eth0 no wireless extensions.
lo no wireless extensions.
Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
MachineType: VMware, Inc. VMware Virtual Platform
PciMultimedia:
ProcFB: 0 svgadrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.13.0-24-generic root=/dev/mapper/wolnfs--vg-root ro
RelatedPackageVersions:
linux-restricted-modules-3.13.0-24-generic N/A
linux-backports-modules-3.13.0-24-generic N/A
linux-firmware 1.127
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 07/30/2013
dmi.bios.vendor: Phoenix Technologies LTD
dmi.bios.version: 6.00
dmi.board.name: 440BX Desktop Reference Platform
dmi.board.vendor: Intel Corporation
dmi.board.version: None
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 1
dmi.chassis.vendor: No Enclosure
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnPhoenixTechnologiesLTD:bvr6.00:bd07/30/2013:svnVMware,Inc.:pnVMwareVirtualPlatform:pvrNone:rvnIntelCorporation:rn440BXDesktopReferencePlatform:rvrNone:cvnNoEnclosure:ct1:cvrN/A:
dmi.product.name: VMware Virtual Platform
dmi.product.version: None
dmi.sys.vendor: VMware, Inc.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nfs-utils/+bug/1315955/+subscriptions