Server load too high when using qemu-img
Alvin
info at alvin.be
Thu Feb 3 15:43:15 UTC 2011
On Thursday 03 February 2011 15:01:37 Serge E. Hallyn wrote:
> Quoting Alvin (info at alvin.be):
> > I have long standing performance problems on Lucid when handling large
> > files.
> >
> > I notice this on several servers, but here is a detailed example of a
> > scenario I encountered yesterday.
> >
> > Server (stilgar) is a quad-core with 8 GB RAM. The server has 3 disks. One
> > disk contains the operating system. The other two are mdadm RAID0 with
> > LVM. I need to recreate the RAID manually[1] on most boots, but
> > otherwise it is working fine.
> > (Before there are any heart attacks from reading 'raid0': the data on it
> > is NOT important, and only meant for testing.)
> > The server runs 4 virtual machines (KVM).
> > - 2 Lucid servers on qcow, residing on the local (non-raid) disk.
> > - 1 Lucid server on a fstab mounted NFS4 share.
> > - 1 Windows desktop on a logical volume.
> >
> > I have an NFS-mounted backup disk. When I restore the Windows image from
> > the backup (60GB), I encounter bug 658131[2]. All running virtual
> > machines will start showing errors like in bug 522014[3] in their logs
> > (hung_task_timeout_secs) and services on them will no longer be
> > reachable. The load on the server can climb to >30. Libvirt will no
> > longer be able to
>
> Is it possible for you to use CIFS instead of NFS?
>
> It's been a few years, but when I had my NAS at home I found CIFS far more
> stable and reliable than NFS.
Yes. I know NFS is somewhat neglected in Ubuntu, but why use MS Windows file
sharing between Linux machines? That makes no sense. NFS is easier to set up.
In short: I could try CIFS, but to rule the network share out of this issue I
copied the image file locally first. It is true that NFS (and maybe CIFS too)
has an impact here; the load gets even higher when it is used.
> > shut down the virtual machines. Nothing can be done except a reboot of
> > the whole machine.
> >
> > From the bug report, it looks like this might be NFS related, but I'm not
> > convinced. If I copy the image first and then restore it, the load also
> > climbs insanely high and the virtual machines will be on the verge of
> > crashing. Services will be temporarily unavailable.
>
> (Not trying to be critical) What do you expect to happen? I.e. what do you
> think is the bug there? Is it that ionice seems to be insufficient? I'm
> asking in particular about the conversion by itself, not the copy, as I
> agree the copy pinning CPU must be a (kernel) bug.
Well, I expect a performance hit, but not hung tasks, especially when using
ionice.
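For reference, the restore commands look roughly like this (the paths and
volume names are placeholders, not the exact ones I used):

  # restore the backup image to the logical volume, with idle I/O priority
  ionice -c 3 qemu-img convert -O raw /backup/windows.qcow2 /dev/vg0/windows
  # or, for a raw image, a plain block copy
  ionice -c 3 dd if=/backup/windows.img of=/dev/vg0/windows bs=1M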
> > The software used is qemu-img or dd. In all cases I'm running the
> > commands with 'ionice -c 3'.
> >
> > This is only an example. Any high IO (e.g. rsync with large files) can
> > crash Lucid servers,
>
> Over NFS, or any rsync?
Both. In the example, NFS/rsync was not used. I only mentioned them because
I've had the same trouble when using them on other servers.
> For that matter, rsync tries to be smart and slice and dice the file to
> minimize network traffic. What about a simple ftp/scp?
>
> > but what should I do? Sometimes it is necessary to copy large
> > files. That should be something that can be done without taking down the
> > entire server. Any thoughts on the matter?
>
> It might be worth testing other IO schedulers.
>
> It also might be worth testing a more current kernel. The kernel team
> does produce backports of newer kernels to lucid which, while surely not
> officially supported, should work and may fix these issues.
I might try those. I see you found my new bug report[1]. You're on to
something there! I didn't remove a USB drive, but there are similar issues
that I had not linked to this before:
- mdadm does not auto-assemble [2]
- I have an LVM snapshot present on that system! Even worse, the snapshot is
100% full and thus corrupt.
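On the mdadm point: what I do after a failed auto-assembly is roughly the
following (device names are from memory and may not match exactly):

  # reassemble the RAID0 array by hand, then reactivate the volume group
  mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1
  vgchange -ay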
Now, I hadn't thought of the snapshot. The presence of an LVM snapshot is a
huge I/O performance hit by itself, so that explains the extreme load: in my
example I was reading the raw image from the snapshot's parent volume.
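A quick way to confirm this is to check how full the snapshot is, and since a
snapshot that has hit 100% is unusable anyway, to drop it (volume names below
are placeholders):

  # the Snap% column shows how full each snapshot is
  lvs
  # a snapshot at 100% is invalid and can only be removed
  lvremove /dev/vg0/windows-snap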
Because of your comment I also found a blog post[3] about the issue:
"Non-existent Device Mapper Volumes Causing I/O Errors?"
So, I will first contact all users and find a moment to take the server
offline for some testing. Then I'll post my findings in the bug report.
Thanks for the tips.
Links:
[1] https://bugs.launchpad.net/bugs/712392
[2] https://bugs.launchpad.net/bugs/27037
[3] http://slated.org/device_mapper_weirdness
--
Alvin