Server increasing load due to increasing processes in D state

Peter M. Petrakis peter.petrakis at canonical.com
Thu Feb 28 15:18:11 UTC 2013



On 02/28/2013 02:13 AM, Alessandro Tagliapietra wrote:
> Thanks for the reply,
>
>> 1. Reconfigure your VMs to use a single vcpu and cap your available vcpus to 8,
>> even less considering that more than half of your applications are IO bound.
>
>
> Do you think a correct config would be to have fewer than 8 CPUs used by the VMs?

It truly depends on your workload. KVM will happily DoS the host
system (HV) if you let it. This isn't a new concept:
over-provisioning guidelines are basically the same for every VM
server out there (Xen, KVM, VMware, etc.), so any documentation on
the subject is interchangeable.
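
As a rough sanity check, the overcommit ratio is just allocated vcpus
divided by host logical CPUs. The sketch below uses the figures quoted
further down in this thread (4 cores x 2 HT, 19 vcpus) and the 17 vcpus
mentioned for the other server. The function name is made up for
illustration, and the ratio alone says nothing about how IO bound the
guests are.

```python
def overcommit_ratio(allocated_vcpus: int, host_logical_cpus: int) -> float:
    """Return allocated vcpus per host logical CPU (1.0 means no overcommit)."""
    if host_logical_cpus <= 0:
        raise ValueError("host_logical_cpus must be positive")
    return allocated_vcpus / host_logical_cpus


# 4 cores x 2 hyperthreads, as reported in this thread.
HOST_LOGICAL_CPUS = 4 * 2

for label, vcpus in (("this host", 19), ("other server", 17)):
    ratio = overcommit_ratio(vcpus, HOST_LOGICAL_CPUS)
    print(f"{label}: {vcpus} vcpus / {HOST_LOGICAL_CPUS} CPUs = {ratio:.3f}x")
```

Anything above 1.0x means the guests can collectively ask for more CPU
than the host has.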

>
>> 2. the software raid can be a choke point for all application progress,
>> virtual or otherwise; it needs CPU time too.
>
>
> Yeah, we're going to switch to hardware RAID as soon as we have the money to.
>
> By the way, the other server with the same specs but 17 vcpu on it doesn't have any issue…

It doesn't have a problem until it does. Get the scheduler in the
right zone with the right resource contention and the VMs may never
get enough useful HV CPU time again. This also makes the reported VM
load useless, as there aren't enough cycles to update the statistics
to reflect the actual state.
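
Since the reported load can't be trusted in that situation, one option
is to count the tasks in uninterruptible sleep (state D) directly. This
is only a minimal sketch, assuming a Linux /proc filesystem; the helper
names are hypothetical and not part of any tool mentioned in this
thread.

```python
import os
from typing import List


def parse_state(stat_line: str) -> str:
    """Extract the one-letter task state from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split after the LAST closing parenthesis.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def d_state_pids(proc_root: str = "/proc") -> List[int]:
    """Return the sorted PIDs currently in state D (uninterruptible sleep)."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                if parse_state(fh.read()) == "D":
                    pids.append(int(entry))
        except (OSError, ValueError, IndexError):
            continue  # process exited or its stat file was unreadable
    return sorted(pids)


if __name__ == "__main__":
    print(d_state_pids())
```

Run on the host rather than inside a guest, a count that keeps growing
between samples matches the symptom in the subject line.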

> Also, can I still run that dump when processes hang again and let you know? Wouldn't things
> just go slower under high load, rather than hang without any way to stop them?
>

This is not a software problem, it's a deployment and
administration problem; you are ultimately responsible here.

  
> Best Regards and thanks again
>
> --
>
> Alessandro Tagliapietra
> alexfu.it (http://www.alexfu.it)
>
> On Monday, 25 February 2013, at 21:38, Peter M. Petrakis wrote:
>
>> summary: You're massively over-provisioned, +100%
>>
>> resolution:
>> 1. Reconfigure your VMs to use a single vcpu and cap your available vcpus to 8,
>> even less considering that more than half of your applications are IO bound.
>> 2. the software raid can be a choke point for all application progress,
>> virtual or otherwise; it needs CPU time too.
>>
>> remarks:
>> you have a test setup, treat it as such, 19 vcpus is overboard.
>>
>> Peter
>>
>> On 02/25/2013 12:17 PM, Alessandro Tagliapietra wrote:
>>> Sorry for disturbing again,
>>>
>>> after the restart I've seen that I'm unable to ssh into the VMs, since on login they run byobu, which now hangs (it never did on a VM).
>>>
>>> I managed to hit ctrl-c fast enough to keep byobu from starting, and an strace on it gave me this:
>>>
>>> http://pastebin.com/raw.php?i=KYMbsxKV
>>>
>>> Thanks
>>>
>>> Best Regards
>>>
>>> --
>>>
>>> Alessandro Tagliapietra
>>> alexfu.it (http://www.alexfu.it)
>>>
>>> On Monday, 25 February 2013, at 17:51, Alessandro Tagliapietra wrote:
>>>
>>>> Hi Eduardo
>>>>
>>>> Thank you for the tips.
>>>>
>>>> I'll wait a few days and let you know when this happens again.
>>>>
>>>> About the load, system CPU usage wasn't more than 10% in top, and IO wait was at 2% most of the time.
>>>>
>>>> We have 4 x 2 (HT) cores on the server and a total of 19 vcpus allocated to the VMs running on that host.
>>>>
>>>> The VMs run mostly nginx+php-fpm+mysql; one also runs RabbitMQ and a Python RabbitMQ consumer.
>>>>
>>>> I'll let you know later then.
>>>>
>>>> Thanks again!
>>>>
>>>> Best
>>>>
>>>> --
>>>>
>>>> Alessandro Tagliapietra
>>>> alexfu.it (http://www.alexfu.it)
>>>>
>>>> On Monday, 25 February 2013, at 16:44, Eduardo Damato wrote:
>>>>
>>>>>
>>>>> Hi Alessandro,
>>>>>
>>>>> Thanks for the information.
>>>>>
>>>>> The sysrq-t that I requested is *only* useful during the problem. Please
>>>>> do that when you encounter the problem again.
>>>>>
>>>>> It may be that you are overcommitting cpus on your system by having many
>>>>> virtual machines running on the nova controller node. This is a
>>>>> completely wild guess, but I would recommend you to look at how many
>>>>> cpus you have and how many virtual machines and if you have any
>>>>> processes in real time or sched FIFO.
>>>>>
>>>>> Cheers,
>>>>> Eduardo.
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>> --
>> ubuntu-server mailing list
>> ubuntu-server at lists.ubuntu.com (mailto:ubuntu-server at lists.ubuntu.com)
>> https://lists.ubuntu.com/mailman/listinfo/ubuntu-server
>> More info: https://wiki.ubuntu.com/ServerTeam
>>
>>
>
>
>



