[Bug 561210] Re: Writing big files to NFS target causes system lock up

Timo Harmonen 561210 at bugs.launchpad.net
Mon Jul 12 12:17:20 UTC 2010


I can repro this issue quite easily with my setup. I'm running two amd64
kvm guests on an amd64 host system with 8GB of memory. The NFS server is
running on the host, and the guests rely on it heavily. All systems are
up-to-date; the kernel is 2.6.32-23.

The guests hang when they access NFS mounts heavily; it seems that
write operations are needed to trigger it. I first used nfs3, then
switched to nfs4, but that didn't really help.

host export:
/srv/mmedia             172.16.0.0/16(rw,nohide,insecure,no_subtree_check,async)

guest fstab mount:
172.16.1.1:/mmedia    /mmedia   nfs4 _netdev,auto 0 0

I have had this issue since upgrading to Lucid, and never had anything
like this with Karmic, where I had exactly the same setup.

dmesg log attached, both from the host and a guest.

One way to repro this is to run a script on the guest that processes
(copies) image files over NFS; it hangs after processing around 20-50
files. System load starts to climb after the script hangs, and I have
seen loads well over 200. Once this happens, all other processes
accessing NFS mounts hang as well. I cannot reboot and have to hard
reset the guest.
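
A minimal sketch of that kind of copy loop, for anyone who wants to try the repro. This is not the actual script (which is Perl and not attached); the function name copy_images, the "*.jpg" pattern, and the example paths are assumptions. Each destination file is closed before the next copy starts, since the traces show the task blocking in close().

```python
import shutil
from pathlib import Path


def copy_images(src_dir, dst_dir, pattern="*.jpg"):
    """Copy every file in src_dir matching pattern into dst_dir.

    Returns the number of files copied. The pattern is an assumption;
    adjust it to whatever image files are being processed.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = 0
    for path in sorted(src.glob(pattern)):
        # Leaving the with-block closes the destination file. On an NFS
        # mount, close() flushes the file's dirty pages to the server.
        with open(path, "rb") as fin, open(dst / path.name, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        copied += 1
        # Progress output shows how many files went through before a hang.
        print(f"{copied}: {path.name}", flush=True)
    return copied
```

For example, copy_images("/mmedia/in", "/mmedia/out") would copy within the NFS mount from the fstab entry above; both subdirectories are hypothetical.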

syslog from the guest:
--------------------------
Jul 12 13:42:14 scotty kernel: [  360.190575] INFO: task perl:4360 blocked for more than 120 seconds.
Jul 12 13:42:14 scotty kernel: [  360.190585] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 12 13:42:14 scotty kernel: [  360.190592] perl          D 0000000000000000     0  4360   4358 0x00000000
Jul 12 13:42:14 scotty kernel: [  360.190605]  ffff8800b02ffc48 0000000000000082 0000000000015bc0 0000000000015bc0
Jul 12 13:42:14 scotty kernel: [  360.190616]  ffff8800ae73c890 ffff8800b02fffd8 0000000000015bc0 ffff8800ae73c4d0
Jul 12 13:42:14 scotty kernel: [  360.190624]  0000000000015bc0 ffff8800b02fffd8 0000000000015bc0 ffff8800ae73c890
Jul 12 13:42:14 scotty kernel: [  360.190633] Call Trace:
Jul 12 13:42:14 scotty kernel: [  360.190729]  [<ffffffffa014a3b0>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
Jul 12 13:42:14 scotty kernel: [  360.190788]  [<ffffffff81541357>] io_schedule+0x47/0x70
Jul 12 13:42:14 scotty kernel: [  360.190816]  [<ffffffffa014a3be>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
Jul 12 13:42:14 scotty kernel: [  360.190824]  [<ffffffff81541bbf>] __wait_on_bit+0x5f/0x90
Jul 12 13:42:14 scotty kernel: [  360.190850]  [<ffffffffa014a3b0>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
Jul 12 13:42:14 scotty kernel: [  360.190860]  [<ffffffff81541c68>] out_of_line_wait_on_bit+0x78/0x90
Jul 12 13:42:14 scotty kernel: [  360.190905]  [<ffffffff81085470>] ? wake_bit_function+0x0/0x40
Jul 12 13:42:14 scotty kernel: [  360.190931]  [<ffffffffa014a39f>] nfs_wait_on_request+0x2f/0x40 [nfs]
Jul 12 13:42:14 scotty kernel: [  360.190964]  [<ffffffffa014e7df>] nfs_wait_on_requests_locked+0x7f/0xd0 [nfs]
Jul 12 13:42:14 scotty kernel: [  360.190992]  [<ffffffffa014fc1e>] nfs_sync_mapping_wait+0x9e/0x1a0 [nfs]
Jul 12 13:42:14 scotty kernel: [  360.191027]  [<ffffffffa0150009>] nfs_write_mapping+0x79/0xb0 [nfs]
Jul 12 13:42:14 scotty kernel: [  360.191060]  [<ffffffff8115f7d0>] ? mntput_no_expire+0x30/0x110
Jul 12 13:42:14 scotty kernel: [  360.191087]  [<ffffffffa0150077>] nfs_wb_all+0x17/0x20 [nfs]
Jul 12 13:42:14 scotty kernel: [  360.191109]  [<ffffffffa013ef9a>] nfs_do_fsync+0x2a/0x60 [nfs]
Jul 12 13:42:14 scotty kernel: [  360.191131]  [<ffffffffa013f1e5>] nfs_file_flush+0x75/0xa0 [nfs]
Jul 12 13:42:14 scotty kernel: [  360.191146]  [<ffffffff8114173c>] filp_close+0x3c/0x90
Jul 12 13:42:14 scotty kernel: [  360.191153]  [<ffffffff81141847>] sys_close+0xb7/0x120
Jul 12 13:42:14 scotty kernel: [  360.191179]  [<ffffffff810131b2>] system_call_fastpath+0x16/0x1b
Jul 12 13:44:14 scotty kernel: [  480.190437] INFO: task perl:4360 blocked for more than 120 seconds.
Jul 12 13:44:14 scotty kernel: [  480.190446] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 12 13:44:14 scotty kernel: [  480.190453] perl          D 0000000000000000     0  4360   4358 0x00000000
Jul 12 13:44:14 scotty kernel: [  480.190466]  ffff8800b02ffc48 0000000000000082 0000000000015bc0 0000000000015bc0
Jul 12 13:44:14 scotty kernel: [  480.190477]  ffff8800ae73c890 ffff8800b02fffd8 0000000000015bc0 ffff8800ae73c4d0
Jul 12 13:44:14 scotty kernel: [  480.190486]  0000000000015bc0 ffff8800b02fffd8 0000000000015bc0 ffff8800ae73c890
Jul 12 13:44:14 scotty kernel: [  480.190495] Call Trace:
Jul 12 13:44:14 scotty kernel: [  480.190534]  [<ffffffffa014a3b0>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
Jul 12 13:44:14 scotty kernel: [  480.190548]  [<ffffffff81541357>] io_schedule+0x47/0x70
Jul 12 13:44:14 scotty kernel: [  480.190582]  [<ffffffffa014a3be>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
Jul 12 13:44:14 scotty kernel: [  480.190591]  [<ffffffff81541bbf>] __wait_on_bit+0x5f/0x90
Jul 12 13:44:14 scotty kernel: [  480.190617]  [<ffffffffa014a3b0>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
Jul 12 13:44:14 scotty kernel: [  480.190626]  [<ffffffff81541c68>] out_of_line_wait_on_bit+0x78/0x90
Jul 12 13:44:14 scotty kernel: [  480.190637]  [<ffffffff81085470>] ? wake_bit_function+0x0/0x40
Jul 12 13:44:14 scotty kernel: [  480.190663]  [<ffffffffa014a39f>] nfs_wait_on_request+0x2f/0x40 [nfs]
Jul 12 13:44:14 scotty kernel: [  480.190690]  [<ffffffffa014e7df>] nfs_wait_on_requests_locked+0x7f/0xd0 [nfs]
Jul 12 13:44:14 scotty kernel: [  480.190718]  [<ffffffffa014fc1e>] nfs_sync_mapping_wait+0x9e/0x1a0 [nfs]
Jul 12 13:44:14 scotty kernel: [  480.190745]  [<ffffffffa0150009>] nfs_write_mapping+0x79/0xb0 [nfs]
Jul 12 13:44:14 scotty kernel: [  480.190756]  [<ffffffff8115f7d0>] ? mntput_no_expire+0x30/0x110
Jul 12 13:44:14 scotty kernel: [  480.190782]  [<ffffffffa0150077>] nfs_wb_all+0x17/0x20 [nfs]
Jul 12 13:44:14 scotty kernel: [  480.190805]  [<ffffffffa013ef9a>] nfs_do_fsync+0x2a/0x60 [nfs]
Jul 12 13:44:14 scotty kernel: [  480.190827]  [<ffffffffa013f1e5>] nfs_file_flush+0x75/0xa0 [nfs]
Jul 12 13:44:14 scotty kernel: [  480.190836]  [<ffffffff8114173c>] filp_close+0x3c/0x90
Jul 12 13:44:14 scotty kernel: [  480.190843]  [<ffffffff81141847>] sys_close+0xb7/0x120
Jul 12 13:44:14 scotty kernel: [  480.190852]  [<ffffffff810131b2>] system_call_fastpath+0x16/0x1b
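
Both reports above show the same task stuck in the same place. A small sketch for pulling that summary out of a longer syslog, assuming the line format shown above; the function name parse_hung_tasks and the regular expressions are illustrative, not part of any existing tool. Frames marked with "?" are skipped because the kernel prints them as unreliable stack entries.

```python
import re

# "INFO: task perl:4360 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)
# " [<ffffffff81541357>] io_schedule+0x47/0x70", optionally with "? "
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (?P<unreliable>\? )?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_hung_tasks(lines):
    """Return one dict per hung-task report found in syslog lines."""
    reports = []
    for line in lines:
        hung = HUNG_RE.search(line)
        if hung:
            reports.append({
                "comm": hung.group("comm"),
                "pid": int(hung.group("pid")),
                "secs": int(hung.group("secs")),
                "frames": [],
            })
            continue
        frame = FRAME_RE.search(line)
        # Attach reliable frames to the most recent report, innermost first.
        if frame and reports and not frame.group("unreliable"):
            reports[-1]["frames"].append(frame.group("func"))
    return reports
```

Run over the log above, each report's frames list starts at io_schedule and ends at system_call_fastpath, which makes it easy to check that every hang sits on the sys_close -> nfs_file_flush -> nfs_do_fsync path.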


** Attachment added: "dmesg-host-guest.txt"
   http://launchpadlibrarian.net/51777196/dmesg-host-guest.txt

-- 
Writing big files to NFS target causes system lock up
https://bugs.launchpad.net/bugs/561210
You received this bug notification because you are a member of Kernel
Bugs, which is subscribed to linux in ubuntu.



