[Bug 1315955] Re: nfsd hangs

F0x06 kevin.velickovic at gmail.com
Mon Jun 8 08:00:54 UTC 2015


Same problem for me.
Ubuntu version: Ubuntu 14.04.2 LTS
Kernel version: 3.13.0-53-generic #89-Ubuntu SMP Wed May 20 10:34:39 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Jun  8 17:28:41 server1 kernel: [ 1561.791349] INFO: task nfsd:1986 blocked for more than 120 seconds.
Jun  8 17:28:41 server1 kernel: [ 1561.791366]       Tainted: P           OX 3.13.0-53-generic #89-Ubuntu
Jun  8 17:28:41 server1 kernel: [ 1561.791384] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun  8 17:28:41 server1 kernel: [ 1561.791405] nfsd            D ffff88041fb93180     0  1986      2 0x00000000
Jun  8 17:28:41 server1 kernel: [ 1561.791407]  ffff8803f94d18a0 0000000000000046 ffff8800d4588000 ffff8803f94d1fd8
Jun  8 17:28:41 server1 kernel: [ 1561.791409]  0000000000013180 0000000000013180 ffff8800d4588000 ffff880256c1e8f8
Jun  8 17:28:41 server1 kernel: [ 1561.791410]  ffff880256c1e8a8 ffff880256c1e900 ffff880256c1e8d0 0000000000000000
Jun  8 17:28:41 server1 kernel: [ 1561.791412] Call Trace:
Jun  8 17:28:41 server1 kernel: [ 1561.791414]  [<ffffffff81727229>] schedule+0x29/0x70
Jun  8 17:28:41 server1 kernel: [ 1561.791419]  [<ffffffffa007eaf5>] cv_wait_common+0xe5/0x120 [spl]
Jun  8 17:28:41 server1 kernel: [ 1561.791421]  [<ffffffff810ab220>] ? prepare_to_wait_event+0x100/0x100
Jun  8 17:28:41 server1 kernel: [ 1561.791426]  [<ffffffffa007eb45>] __cv_wait+0x15/0x20 [spl]
Jun  8 17:28:41 server1 kernel: [ 1561.791437]  [<ffffffffa0139da3>] dmu_buf_hold_array_by_dnode+0x233/0x570 [zfs]
Jun  8 17:28:41 server1 kernel: [ 1561.791449]  [<ffffffffa013a1bd>] dmu_buf_hold_array+0x5d/0x80 [zfs]
Jun  8 17:28:41 server1 kernel: [ 1561.791461]  [<ffffffffa013ba01>] dmu_read_uio+0x41/0xe0 [zfs]
Jun  8 17:28:41 server1 kernel: [ 1561.791480]  [<ffffffffa01bafbc>] zfs_read+0x14c/0x450 [zfs]
Jun  8 17:28:41 server1 kernel: [ 1561.791492]  [<ffffffffa01385fe>] ? dmu_object_size_from_db+0x5e/0x80 [zfs]
Jun  8 17:28:41 server1 kernel: [ 1561.791511]  [<ffffffffa01d757a>] zpl_aio_read+0xda/0x130 [zfs]
Jun  8 17:28:41 server1 kernel: [ 1561.791513]  [<ffffffff811bd9cc>] do_sync_readv_writev+0x4c/0x80
Jun  8 17:28:41 server1 kernel: [ 1561.791515]  [<ffffffff811bee90>] do_readv_writev+0xb0/0x220
Jun  8 17:28:41 server1 kernel: [ 1561.791534]  [<ffffffffa01bac57>] ? zfs_open+0x87/0x120 [zfs]
Jun  8 17:28:41 server1 kernel: [ 1561.791536]  [<ffffffff813213d3>] ? ima_get_action+0x23/0x30
Jun  8 17:28:41 server1 kernel: [ 1561.791538]  [<ffffffff813206b2>] ? process_measurement+0x82/0x2c0
Jun  8 17:28:41 server1 kernel: [ 1561.791539]  [<ffffffff811bf02d>] vfs_readv+0x2d/0x50
Jun  8 17:28:41 server1 kernel: [ 1561.791543]  [<ffffffffa0401aae>] nfsd_vfs_read.isra.12+0x6e/0x160 [nfsd]
Jun  8 17:28:41 server1 kernel: [ 1561.791547]  [<ffffffffa0402da9>] ? nfsd_open+0xb9/0x190 [nfsd]
Jun  8 17:28:41 server1 kernel: [ 1561.791551]  [<ffffffffa0403076>] nfsd_read+0x1e6/0x2c0 [nfsd]
Jun  8 17:28:41 server1 kernel: [ 1561.791557]  [<ffffffffa040ce4c>] nfsd3_proc_read+0xcc/0x170 [nfsd]
Jun  8 17:28:41 server1 kernel: [ 1561.791561]  [<ffffffffa03fdd3b>] nfsd_dispatch+0xbb/0x200 [nfsd]
Jun  8 17:28:41 server1 kernel: [ 1561.791568]  [<ffffffffa036262d>] svc_process_common+0x46d/0x6d0 [sunrpc]
Jun  8 17:28:41 server1 kernel: [ 1561.791575]  [<ffffffffa0362997>] svc_process+0x107/0x170 [sunrpc]
Jun  8 17:28:41 server1 kernel: [ 1561.791578]  [<ffffffffa03fd71f>] nfsd+0xbf/0x130 [nfsd]
Jun  8 17:28:41 server1 kernel: [ 1561.791582]  [<ffffffffa03fd660>] ? nfsd_destroy+0x80/0x80 [nfsd]
Jun  8 17:28:41 server1 kernel: [ 1561.791583]  [<ffffffff8108b6b2>] kthread+0xd2/0xf0
Jun  8 17:28:41 server1 kernel: [ 1561.791585]  [<ffffffff8108b5e0>] ? kthread_create_on_node+0x1c0/0x1c0
Jun  8 17:28:41 server1 kernel: [ 1561.791586]  [<ffffffff81733868>] ret_from_fork+0x58/0x90
Jun  8 17:28:41 server1 kernel: [ 1561.791587]  [<ffffffff8108b5e0>] ? kthread_create_on_node+0x1c0/0x1c0

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to nfs-utils in Ubuntu.
https://bugs.launchpad.net/bugs/1315955

Title:
  nfsd hangs

Status in NFS-Utils - NFS support files common to client and server:
  New
Status in nfs-utils package in Ubuntu:
  Incomplete
Status in nfs-utils source package in Trusty:
  Incomplete

Bug description:
  On a relatively busy NFS server, the system hung on us with the
  following messages:

  May  4 07:53:36 wol-nfs kernel: [487678.715589] INFO: task nfsd:2793 blocked for more than 120 seconds.
  May  4 07:53:36 wol-nfs kernel: [487678.715653]       Not tainted 3.13.0-24-generic #46-Ubuntu
  May  4 07:53:36 wol-nfs kernel: [487678.715695] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  May  4 07:53:36 wol-nfs kernel: [487678.715790] nfsd            D ffff88023fc14440     0  2793      2 0x00000000
  May  4 07:53:36 wol-nfs kernel: [487678.715800]  ffff88023317fca0 0000000000000002 ffff880233268000 ffff88023317ffd8
  May  4 07:53:36 wol-nfs kernel: [487678.715807]  0000000000014440 0000000000014440 ffff880233268000 ffffffffa03520a0
  May  4 07:53:36 wol-nfs kernel: [487678.715811]  ffffffffa03520a4 ffff880233268000 00000000ffffffff ffffffffa03520a8
  May  4 07:53:36 wol-nfs kernel: [487678.715818] Call Trace:
  May  4 07:53:36 wol-nfs kernel: [487678.715860]  [<ffffffff8171a3a9>] schedule_preempt_disabled+0x29/0x70
  May  4 07:53:36 wol-nfs kernel: [487678.715865]  [<ffffffff8171c215>] __mutex_lock_slowpath+0x135/0x1b0
  May  4 07:53:36 wol-nfs kernel: [487678.715870]  [<ffffffff8171c2af>] mutex_lock+0x1f/0x2f
  May  4 07:53:36 wol-nfs kernel: [487678.715905]  [<ffffffffa033be55>] nfs4_lock_state+0x15/0x20 [nfsd]
  May  4 07:53:36 wol-nfs kernel: [487678.715917]  [<ffffffffa032e858>] nfsd4_open+0xd8/0x8f0 [nfsd]
  May  4 07:53:36 wol-nfs kernel: [487678.715928]  [<ffffffffa032f5da>] nfsd4_proc_compound+0x56a/0x7b0 [nfsd]
  May  4 07:53:36 wol-nfs kernel: [487678.715937]  [<ffffffffa031bd2b>] nfsd_dispatch+0xbb/0x200 [nfsd]
  May  4 07:53:36 wol-nfs kernel: [487678.715961]  [<ffffffffa026a63d>] svc_process_common+0x46d/0x6d0 [sunrpc]
  May  4 07:53:36 wol-nfs kernel: [487678.715977]  [<ffffffffa026a9a7>] svc_process+0x107/0x170 [sunrpc]
  May  4 07:53:36 wol-nfs kernel: [487678.715986]  [<ffffffffa031b71f>] nfsd+0xbf/0x130 [nfsd]
  May  4 07:53:36 wol-nfs kernel: [487678.715995]  [<ffffffffa031b660>] ? nfsd_destroy+0x80/0x80 [nfsd]
  May  4 07:53:36 wol-nfs kernel: [487678.716004]  [<ffffffff8108b312>] kthread+0xd2/0xf0
  May  4 07:53:36 wol-nfs kernel: [487678.716009]  [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
  May  4 07:53:36 wol-nfs kernel: [487678.716016]  [<ffffffff8172637c>] ret_from_fork+0x7c/0xb0
  May  4 07:53:36 wol-nfs kernel: [487678.716020]  [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0

  And many more with the exact same stack trace:

  May  4 07:53:36 wol-nfs kernel: [487678.716025] INFO: task nfsd:2794 blocked for more than 120 seconds.
  May  4 07:53:36 wol-nfs kernel: [487678.716500] INFO: task nfsd:2795 blocked for more than 120 seconds.
  May  4 07:53:36 wol-nfs kernel: [487678.717166] INFO: task nfsd:2796 blocked for more than 120 seconds.
  May  4 07:53:36 wol-nfs kernel: [487678.717657] INFO: task nfsd:2797 blocked for more than 120 seconds.
  May  4 07:53:36 wol-nfs kernel: [487678.718150] INFO: task nfsd:2798 blocked for more than 120 seconds.
  May  4 07:53:36 wol-nfs kernel: [487678.718743] INFO: task nfsd:2799 blocked for more than 120 seconds.

  Except this one:

  May  4 07:53:36 wol-nfs kernel: [487678.719229] INFO: task nfsd:2800 blocked for more than 120 seconds.
  May  4 07:53:36 wol-nfs kernel: [487678.719347]       Not tainted 3.13.0-24-generic #46-Ubuntu
  May  4 07:53:36 wol-nfs kernel: [487678.719605] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  May  4 07:53:36 wol-nfs kernel: [487678.719741] nfsd            D ffff88023fd94440     0  2800      2 0x00000000
  May  4 07:53:36 wol-nfs kernel: [487678.719746]  ffff8800b81f1b40 0000000000000002 ffff88022f96c7d0 ffff8800b81f1fd8
  May  4 07:53:36 wol-nfs kernel: [487678.719751]  0000000000014440 0000000000014440 ffff88022f96c7d0 ffff8800b81f1ca8
  May  4 07:53:36 wol-nfs kernel: [487678.719755]  ffff8800b81f1cb0 7fffffffffffffff ffff88022f96c7d0 ffff8800b81f1c90
  May  4 07:53:36 wol-nfs kernel: [487678.719760] Call Trace:
  May  4 07:53:36 wol-nfs kernel: [487678.719766]  [<ffffffff81719e89>] schedule+0x29/0x70
  May  4 07:53:36 wol-nfs kernel: [487678.719770]  [<ffffffff817190d9>] schedule_timeout+0x239/0x2d0
  May  4 07:53:36 wol-nfs kernel: [487678.719775]  [<ffffffff81719a11>] ? __schedule+0x381/0x7d0
  May  4 07:53:36 wol-nfs kernel: [487678.719781]  [<ffffffff8101b763>] ? native_sched_clock+0x13/0x80
  May  4 07:53:36 wol-nfs kernel: [487678.719786]  [<ffffffff8101b7d9>] ? sched_clock+0x9/0x10
  May  4 07:53:36 wol-nfs kernel: [487678.719791]  [<ffffffff8171a9a6>] wait_for_completion+0xa6/0x160
  May  4 07:53:36 wol-nfs kernel: [487678.719798]  [<ffffffff8109a790>] ? wake_up_state+0x20/0x20
  May  4 07:53:36 wol-nfs kernel: [487678.719804]  [<ffffffff810824ca>] flush_workqueue+0x11a/0x5a0
  May  4 07:53:36 wol-nfs kernel: [487678.719818]  [<ffffffffa0346683>] nfsd4_shutdown_callback+0x73/0x80 [nfsd]
  May  4 07:53:36 wol-nfs kernel: [487678.719829]  [<ffffffffa033d37d>] destroy_client+0x18d/0x430 [nfsd]
  May  4 07:53:36 wol-nfs kernel: [487678.719840]  [<ffffffffa033e9d6>] nfsd4_setclientid_confirm+0x1e6/0x210 [nfsd]
  May  4 07:53:36 wol-nfs kernel: [487678.719849]  [<ffffffffa032f5da>] nfsd4_proc_compound+0x56a/0x7b0 [nfsd]
  May  4 07:53:36 wol-nfs kernel: [487678.719857]  [<ffffffffa031bd2b>] nfsd_dispatch+0xbb/0x200 [nfsd]
  May  4 07:53:36 wol-nfs kernel: [487678.719872]  [<ffffffffa026a63d>] svc_process_common+0x46d/0x6d0 [sunrpc]
  May  4 07:53:36 wol-nfs kernel: [487678.719885]  [<ffffffffa026a9a7>] svc_process+0x107/0x170 [sunrpc]
  May  4 07:53:36 wol-nfs kernel: [487678.719893]  [<ffffffffa031b71f>] nfsd+0xbf/0x130 [nfsd]
  May  4 07:53:36 wol-nfs kernel: [487678.719901]  [<ffffffffa031b660>] ? nfsd_destroy+0x80/0x80 [nfsd]
  May  4 07:53:36 wol-nfs kernel: [487678.719905]  [<ffffffff8108b312>] kthread+0xd2/0xf0
  May  4 07:53:36 wol-nfs kernel: [487678.719909]  [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
  May  4 07:53:36 wol-nfs kernel: [487678.719914]  [<ffffffff8172637c>] ret_from_fork+0x7c/0xb0
  May  4 07:53:36 wol-nfs kernel: [487678.719918]  [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0

  
  It looks like the last thread (nfsd:2800) hung while holding a lock, blocking every other nfsd thread.
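
  To check that kind of guess against a full kern.log, a minimal sketch along these lines could group the hung-task reports by their innermost stack frames, so that the one thread waiting somewhere different stands out. The function name group_hung_tasks and the depth parameter are made up for illustration, and the regular expressions only assume the log layout shown in the messages above.

```python
import re
from collections import defaultdict

# Matches the hung-task header, e.g. "INFO: task nfsd:2793 blocked for more than 120 seconds."
TASK_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than \d+ seconds")
# Matches one stack frame, e.g. "[<ffffffff8171c2af>] mutex_lock+0x1f/0x2f";
# group 1 is set for frames the kernel marks as unreliable with a leading "?".
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (\? )?([A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def group_hung_tasks(lines, depth=4):
    """Group hung tasks by the first `depth` reliable frames of their call trace.

    Returns a dict mapping a tuple of function names to the list of PIDs
    whose trace starts with those frames. Tasks reported without a trace
    end up under the empty tuple.
    """
    traces = {}
    current = None
    for line in lines:
        task = TASK_RE.search(line)
        if task:
            current = int(task.group(2))
            traces[current] = []
            continue
        if "BUG:" in line:
            # A different kind of report (e.g. a soft lockup) starts here;
            # its frames do not belong to the previous hung task.
            current = None
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None and frame.group(1) is None:
            traces[current].append(frame.group(2))

    groups = defaultdict(list)
    for pid, frames in traces.items():
        groups[tuple(frames[:depth])].append(pid)
    return dict(groups)
```

  Calling group_hung_tasks(open(path)) on a saved copy of the log (path being wherever that copy lives) should put the threads stuck in the mutex slow path in one group and any thread blocked elsewhere, such as in flush_workqueue, in its own. This only sorts the reports; it does not prove which thread holds the lock.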

  
  Preceding the hang, there were a few suspicious messages about a CPU soft lockup, with the following stack trace. This may or may not be related; it happened days earlier, though, so it's probably nothing.

  Apr 30 12:45:41 wol-nfs kernel: [159283.910727] BUG: soft lockup - CPU#2 stuck for 22s! [chown:6108]
  Apr 30 12:45:41 wol-nfs kernel: [159283.910928] Call Trace:
  Apr 30 12:45:41 wol-nfs kernel: [159283.910934]  [<ffffffff812085e0>] ? locks_delete_block+0x70/0x80
  Apr 30 12:45:41 wol-nfs kernel: [159283.910937]  [<ffffffff81209f40>] __break_lease+0x350/0x3d0
  Apr 30 12:45:41 wol-nfs kernel: [159283.910940]  [<ffffffff811d5b48>] ? notify_change+0x1a8/0x390
  Apr 30 12:45:41 wol-nfs kernel: [159283.910943]  [<ffffffff811b6767>] chown_common+0x117/0x180
  Apr 30 12:45:41 wol-nfs kernel: [159283.910945]  [<ffffffff811b826f>] SyS_fchownat+0xaf/0x110
  Apr 30 12:45:41 wol-nfs kernel: [159283.910948]  [<ffffffff8172663f>] tracesys+0xe1/0xe6
  Apr 30 12:45:41 wol-nfs kernel: [159283.910949] Code: 39 d0 75 ea b8 01 00 00 00 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 e9 06 00 00 00 66 83 07 02 c3 90 8b 37 f0 66 83 07 02 <f6> 47 02 01 74 f1 55 48 89 e5 e8 31 1b ff ff 5d c3 0f 1f 84 00

  
  The relevant sections of kern.log are in a separate attachment.

  ProblemType: Bug
  DistroRelease: Ubuntu 14.04
  Package: linux-generic 3.13.0.24.29
  ProcVersionSignature: Ubuntu 3.13.0-24.46-generic 3.13.9
  Uname: Linux 3.13.0-24-generic x86_64
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 May  4 23:41 seq
   crw-rw---- 1 root audio 116, 33 May  4 23:41 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.14.1-0ubuntu3
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: [Errno 2] No such file or directory: 'iw'
  CurrentDmesg:
   [    5.274819] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
   [    5.279871] NFSD: starting 90-second grace period (net ffffffff81cd9b00)
   [    5.518836] init: plymouth-upstart-bridge main process ended, respawning
   [   12.233348] [UFW BLOCK] IN=eth0 OUT= MAC=00:50:56:91:fc:20:00:00:00:00:00:00:08:00 SRC=10.0.0.0 DST=224.0.0.1 LEN=36 TOS=0x00 PREC=0x00 TTL=1 ID=0 PROTO=2
  Date: Mon May  5 00:29:12 2014
  HibernationDevice: RESUME=/dev/mapper/wolnfs--vg-swap_1
  InstallationDate: Installed on 2014-04-20 (14 days ago)
  InstallationMedia: Ubuntu-Server 14.04 LTS "Trusty Tahr" - Release amd64 (20140416.2)
  IwConfig:
   eth0      no wireless extensions.
   
   lo        no wireless extensions.
  Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
  MachineType: VMware, Inc. VMware Virtual Platform
  PciMultimedia:
   
  ProcFB: 0 svgadrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.13.0-24-generic root=/dev/mapper/wolnfs--vg-root ro
  RelatedPackageVersions:
   linux-restricted-modules-3.13.0-24-generic N/A
   linux-backports-modules-3.13.0-24-generic  N/A
   linux-firmware                             1.127
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 07/30/2013
  dmi.bios.vendor: Phoenix Technologies LTD
  dmi.bios.version: 6.00
  dmi.board.name: 440BX Desktop Reference Platform
  dmi.board.vendor: Intel Corporation
  dmi.board.version: None
  dmi.chassis.asset.tag: No Asset Tag
  dmi.chassis.type: 1
  dmi.chassis.vendor: No Enclosure
  dmi.chassis.version: N/A
  dmi.modalias: dmi:bvnPhoenixTechnologiesLTD:bvr6.00:bd07/30/2013:svnVMware,Inc.:pnVMwareVirtualPlatform:pvrNone:rvnIntelCorporation:rn440BXDesktopReferencePlatform:rvrNone:cvnNoEnclosure:ct1:cvrN/A:
  dmi.product.name: VMware Virtual Platform
  dmi.product.version: None
  dmi.sys.vendor: VMware, Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nfs-utils/+bug/1315955/+subscriptions


