[Bug 1950186] Re: Nova doesn't account for hugepages when scheduling VMs
James Page
1950186 at bugs.launchpad.net
Fri Feb 18 15:45:32 UTC 2022
Discussed with the Nova team and this is a known issue at the moment -
mixing instance types with and without NUMA configuration features such
as hugepages will create this type of issue.
The placement API (which is used for scheduling) does not track
different page sizes, so it can't deal with this scenario today.
Feedback indicated that using flavors explicitly configured to use
small pages might do the trick, as this triggers the code path through
the NUMA cell configuration in Nova.
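A minimal sketch of that workaround, assuming the hw:mem_page_size
flavor extra spec is what "explicit configuration to use small pages"
refers to. The flavor name and the DRY_RUN switch are hypothetical and
only there for illustration; by default the script prints the command
rather than calling the CLI.

```shell
#!/bin/sh
# FLAVOR is a hypothetical flavor name; substitute one from your own cloud.
FLAVOR="${FLAVOR:-m1.large}"
# DRY_RUN=1 (the default here) prints the command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Mark a flavor as explicitly backed by small pages (assumed workaround).
set_small_pages() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "openstack flavor set $1 --property hw:mem_page_size=small"
    else
        openstack flavor set "$1" --property hw:mem_page_size=small
    fi
}

set_small_pages "$FLAVOR"
```

Run with DRY_RUN=0 to apply the property. Whether this fully avoids
the overcommitment has not been verified here.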
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to nova in Ubuntu.
https://bugs.launchpad.net/bugs/1950186
Title:
Nova doesn't account for hugepages when scheduling VMs
Status in nova package in Ubuntu:
New
Bug description:
Description
===========
When hugepages are enabled on the host it's possible to schedule VMs
using more RAM than available.
On the node with the memory usage shown below, it was possible to
schedule 6 instances using a total of 140G of memory with a flavor that
does not use hugepages. The same machine has 188G of memory in total,
of which 64G was reserved for hugepages. An additional ~4G was used for
housekeeping, the OpenStack control plane, etc. This resulted in
overcommitment of roughly 20G.
After running memory intensive operations on the VMs, some of them got
OOM killed.
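The arithmetic behind the "roughly 20G" figure can be sketched as
follows. The values are the rounded numbers from the description above;
the variable names are made up for illustration.

```shell
#!/bin/sh
# Rounded figures from the description, in GB.
total_gb=188        # total memory on the compute node
hugepages_gb=64     # reserved for hugepages
housekeeping_gb=4   # housekeeping, control plane, etc.
instances_gb=140    # memory of the 6 scheduled instances

# Memory actually left for guests that are not backed by hugepages.
usable_gb=$((total_gb - hugepages_gb - housekeeping_gb))

# Anything scheduled beyond that is overcommitted.
overcommit_gb=$((instances_gb - usable_gb))

echo "usable for small-page guests: ${usable_gb}G"
echo "overcommitted by:             ${overcommit_gb}G"
```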
$ cat /proc/meminfo | egrep "^(Mem|Huge)" # on the compute node
MemTotal: 197784792 kB
MemFree: 115005288 kB
MemAvailable: 116745612 kB
HugePages_Total: 64
HugePages_Free: 64
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB
Hugetlb: 67108864 kB
$ os hypervisor show compute1 -c memory_mb -c memory_mb_used -c free_ram_mb
+----------------+--------+
| Field          | Value  |
+----------------+--------+
| free_ram_mb    | 29309  |
| memory_mb      | 193149 |
| memory_mb_used | 163840 |
+----------------+--------+
$ os host show compute1
+----------+----------------------------------+-----+-----------+---------+
| Host     | Project                          | CPU | Memory MB | Disk GB |
+----------+----------------------------------+-----+-----------+---------+
| compute1 | (total)                          | 0   | 193149    | 893     |
| compute1 | (used_now)                       | 72  | 163840    | 460     |
| compute1 | (used_max)                       | 72  | 147456    | 460     |
| compute1 | some_project_id_was_here         | 2   | 4096      | 40      |
| compute1 | another_anonymized_id_here       | 70  | 143360    | 420     |
+----------+----------------------------------+-----+-----------+---------+
$ os resource provider inventory list uuid_of_compute1_node
+----------------+------------------+----------+----------+----------+-----------+--------+
| resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total  |
+----------------+------------------+----------+----------+----------+-----------+--------+
| MEMORY_MB      | 1.0              | 1        | 193149   | 16384    | 1         | 193149 |
| DISK_GB        | 1.0              | 1        | 893      | 0        | 1         | 893    |
| PCPU           | 1.0              | 1        | 72       | 0        | 1         | 72     |
+----------------+------------------+----------+----------+----------+-----------+--------+
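To illustrate why the scheduler accepted these instances, the sketch
below compares the MEMORY_MB capacity derived from the inventory above,
using placement's (total - reserved) * allocation_ratio rule, with the
memory left once the Hugetlb pool from /proc/meminfo is subtracted.
The variable names are illustrative only, and the comparison is a rough
estimate rather than Nova's own accounting.

```shell
#!/bin/sh
# MEMORY_MB inventory row from the resource provider listing.
total_mb=193149
reserved_mb=16384
allocation_ratio=1.0

# Hugetlb from /proc/meminfo (67108864 kB), converted to MB.
hugetlb_mb=$((67108864 / 1024))

# Capacity as placement sees it: (total - reserved) * allocation_ratio.
placement_capacity_mb=$(awk -v t="$total_mb" -v r="$reserved_mb" \
    -v a="$allocation_ratio" 'BEGIN { printf "%d", (t - r) * a }')

# Memory that is really available to guests not backed by hugepages.
small_page_mb=$((total_mb - hugetlb_mb))

echo "placement capacity: ${placement_capacity_mb} MB"
echo "small-page memory:  ${small_page_mb} MB"
echo "difference:         $((placement_capacity_mb - small_page_mb)) MB"
```

With these numbers placement reports about 48G more schedulable memory
than small-page guests can actually use, because the 16384 MB reserved
value does not cover the 65536 MB hugepage pool.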
Steps to reproduce
==================
1. Reserve a large part of memory for hugepages on the hypervisor.
2. Create VMs using a flavor that uses a lot of memory that isn't backed by hugepages.
3. Start memory intensive operations on the VMs, e.g.:
stress-ng --vm-bytes $(awk '/MemAvailable/{printf "%d", $2 * 0.98;}' < /proc/meminfo)k --vm-keep -m 1
Expected result
===============
Nova should not allow overcommitment and should be able to
differentiate between hugepages and "normal" memory.
Actual result
=============
Overcommitment resulting in OOM kills.
Environment
===========
nova-api-metadata 2:21.2.1-0ubuntu1~cloud0
nova-common 2:21.2.1-0ubuntu1~cloud0
nova-compute 2:21.2.1-0ubuntu1~cloud0
nova-compute-kvm 2:21.2.1-0ubuntu1~cloud0
nova-compute-libvirt 2:21.2.1-0ubuntu1~cloud0
python3-nova 2:21.2.1-0ubuntu1~cloud0
python3-novaclient 2:17.0.0-0ubuntu1~cloud0
OS: Ubuntu 18.04.5 LTS
Hypervisor: libvirt + KVM
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/1950186/+subscriptions