Server load too high when using qemu-img
Alvin
info at alvin.be
Tue Feb 1 09:35:29 UTC 2011
I have long-standing performance problems on Lucid when handling large files.
I notice this on several servers, but here is a detailed example of a scenario
I encountered yesterday.
The server (stilgar) is a quad-core machine with 8 GB RAM and 3 disks. One disk
contains the operating system; the other two form an mdadm RAID0 array with LVM
on top. I need to reassemble the RAID manually[1] on most boots, but otherwise
it works fine.
(Before there are any heart attacks from reading 'raid0': the data on it is
NOT important, and only meant for testing.)
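For reference, the manual reassembly after a boot comes down to roughly the
following (a sketch from memory; actual device and volume group names will
differ on your setup):

  # reassemble the md array from its component disks
  mdadm --assemble --scan
  # re-activate the LVM logical volumes on top of it
  vgchange -ay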
The server runs 4 virtual machines (KVM).
- 2 Lucid servers on qcow images, residing on the local (non-RAID) disk.
- 1 Lucid server on an fstab-mounted NFS4 share (example entry below).
- 1 Windows desktop on a logical volume.
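The NFS4 share is mounted through /etc/fstab with an entry roughly like this
(hostname and paths here are only placeholders):

  nfsserver:/vmstore  /srv/vmstore  nfs4  defaults  0  0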
I have an NFS-mounted backup disk. When I restore the Windows image (60 GB)
from the backup, I run into bug 658131[2]. All running virtual machines start
showing errors like those in bug 522014[3] in their logs
(hung_task_timeout_secs) and services on them are no longer reachable. The
load on the server can climb above 30. Libvirt is no longer able to shut down
the virtual machines. Nothing can be done except rebooting the whole machine.
From the bug report, it looks like this might be NFS-related, but I'm not
convinced. If I first copy the image locally and then restore it, the load
also climbs insanely high and the virtual machines end up on the verge of
crashing. Services become temporarily unavailable.
The tools used are qemu-img or dd. In all cases I run the commands with
'ionice -c 3'.
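Concretely, the restore is roughly one of the following (a sketch; paths,
image names and the volume group name are placeholders):

  # convert the backup image straight onto the logical volume
  ionice -c 3 qemu-img convert -O raw /mnt/backup/windows.qcow2 /dev/vg0/windows
  # or a plain block copy with dd
  ionice -c 3 dd if=/mnt/backup/windows.img of=/dev/vg0/windows bs=1M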
This is only one example. Any heavy I/O can bring a Lucid server to its knees;
even an ordinary rsync of large files, like the sketch below, is enough to
trigger the problem.
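A minimal illustration (paths are placeholders):

  ionice -c 3 rsync -a /mnt/backup/images/ /srv/images/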
What should I do? Sometimes it is necessary to copy large files, and that
should be possible without taking down the entire server. Any thoughts on the
matter?
Links:
[1] https://bugs.launchpad.net/bugs/27037
[2] https://bugs.launchpad.net/bugs/658131
[3] https://bugs.launchpad.net/bugs/522014
Example from /var/log/messages (kernel) on the server:
kvm D 0000000000000000 0 9632 1 0x00000000
ffff8801a4269ca8 0000000000000086 0000000000015bc0 0000000000015bc0
ffff8802004fdf38 ffff8801a4269fd8 0000000000015bc0 ffff8802004fdb80
0000000000015bc0 ffff8801a4269fd8 0000000000015bc0 ffff8802004fdf38
Call Trace:
[<ffffffff815596b7>] __mutex_lock_slowpath+0x107/0x190
[<ffffffff815590b3>] mutex_lock+0x23/0x50
[<ffffffff810f5899>] generic_file_aio_write+0x59/0xe0
[<ffffffff811d7879>] ext4_file_write+0x39/0xb0
[<ffffffff81143a8a>] do_sync_write+0xfa/0x140
[<ffffffff81084380>] ? autoremove_wake_function+0x0/0x40
[<ffffffff81252316>] ? security_file_permission+0x16/0x20
[<ffffffff81143d88>] vfs_write+0xb8/0x1a0
[<ffffffff81144722>] sys_pwrite64+0x82/0xa0
[<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
kdmflush D 0000000000000002 0 396 2 0x00000000
ffff88022eeb3d10 0000000000000046 0000000000015bc0 0000000000015bc0
ffff88022f489a98 ffff88022eeb3fd8 0000000000015bc0 ffff88022f4896e0
0000000000015bc0 ffff88022eeb3fd8 0000000000015bc0 ffff88022f489a98
Call Trace:
[<ffffffff815589a7>] io_schedule+0x47/0x70
[<ffffffff81435383>] dm_wait_for_completion+0xa3/0x160
[<ffffffff81059b90>] ? default_wake_function+0x0/0x20
[<ffffffff81435d47>] ? __split_and_process_bio+0x127/0x190
[<ffffffff81435dda>] dm_flush+0x2a/0x70
[<ffffffff81435e6c>] dm_wq_work+0x4c/0x1c0
[<ffffffff81435e20>] ? dm_wq_work+0x0/0x1c0
[<ffffffff8107f7e7>] run_workqueue+0xc7/0x1a0
[<ffffffff8107f963>] worker_thread+0xa3/0x110
[<ffffffff81084380>] ? autoremove_wake_function+0x0/0x40
[<ffffffff8107f8c0>] ? worker_thread+0x0/0x110
[<ffffffff81084006>] kthread+0x96/0xa0
[<ffffffff810131ea>] child_rip+0xa/0x20
[<ffffffff81083f70>] ? kthread+0x0/0xa0
[<ffffffff810131e0>] ? child_rip+0x0/0x20