[Bug 1863242] Re: [SRU] OOM errors with new kernels on resuming
Launchpad Bug Tracker
1863242 at bugs.launchpad.net
Tue Apr 7 19:23:56 UTC 2020
This bug was fixed in the package ec2-hibinit-agent -
1.0.0-0ubuntu4~18.04.4
---------------
ec2-hibinit-agent (1.0.0-0ubuntu4~18.04.4) bionic; urgency=medium
* debian/hibinit-resume: Add extra steps around swapoff to avoid OOM errors.
Also work around xen-netfront not resuming properly.
Thanks to Francis Ginther for the initial patch (LP: #1863242, #1864041)
-- Balint Reczey <rbalint at ubuntu.com> Mon, 23 Mar 2020 13:03:38 +0100
** Changed in: ec2-hibinit-agent (Ubuntu Bionic)
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to ec2-hibinit-agent in Ubuntu.
https://bugs.launchpad.net/bugs/1863242
Title:
[SRU] OOM errors with new kernels on resuming
Status in ec2-hibinit-agent package in Ubuntu:
Fix Released
Status in ec2-hibinit-agent source package in Bionic:
Fix Released
Status in ec2-hibinit-agent source package in Eoan:
Fix Committed
Bug description:
[Impact]
* When EC2 instances resume from hibernation, processes are sometimes
killed by the OOM killer.
[Test Case]
* Set up an EC2 instance to allow hibernation as the stop instance action.
* Start the attached Python script in a screen session to reserve 85% of the memory:
python3 mem-waster-pct.py -p 85
* Log out, hibernate, then resume the instance.
* Observe that the Python script is still running after resuming.
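
The attached mem-waster-pct.py is not reproduced in this notification.
A rough stand-in for the test case could look like the sketch below. It
is an assumption about what such a script does, not the attached file:
every function name here is hypothetical, and only the -p option is
taken from the command line above.

```python
import argparse
import time

PAGE = 4096  # assumed page size, used only to touch each page once


def mem_total_bytes(meminfo_text):
    """Return MemTotal from /proc/meminfo-style text, in bytes."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) * 1024  # the value is given in kB
    raise ValueError("MemTotal not found")


def bytes_to_reserve(total_bytes, percent):
    """Return the number of bytes that make up `percent` of total_bytes."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    return total_bytes * percent // 100


def reserve(n_bytes, chunk=64 * 1024 * 1024):
    """Allocate n_bytes in chunks and write to every page of each chunk."""
    chunks = []
    remaining = n_bytes
    while remaining > 0:
        size = min(chunk, remaining)
        buf = bytearray(size)
        # Write one byte per page so the kernel really backs the memory.
        buf[::PAGE] = b"\x01" * ((size + PAGE - 1) // PAGE)
        chunks.append(buf)
        remaining -= size
    return chunks


def main(argv=None):
    parser = argparse.ArgumentParser(description="Reserve a share of RAM.")
    parser.add_argument("-p", dest="percent", type=int, default=85)
    args = parser.parse_args(argv)
    with open("/proc/meminfo") as f:
        total = mem_total_bytes(f.read())
    held = reserve(bytes_to_reserve(total, args.percent))
    print("holding", sum(len(c) for c in held), "bytes")
    while True:  # keep the memory held until the process is killed
        time.sleep(60)
```

The block only defines functions; calling main(["-p", "85"]) starts the
reservation and blocks until the process is terminated.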
[Regression Potential]
* The fix sets the memory overcommit policy to 'always overcommit'
while the swap file is being removed. This helps deal with the
shrinking swap space during swap removal. No side effect is expected,
since processes trying to allocate an excessive amount of memory would
fail under stricter policies, too.
The fix introduces a potential race condition with processes detecting
the overcommit policy:
The policy in effect when the hibernation took place is saved shortly
after resuming and is restored after the swap file is removed. In this
time window other processes detect the policy as 'always overcommit',
even though it may not have been set as such before hibernation and may
be restored to a different policy after the swap file is removed.
Hitting this race condition seems unlikely, and there seems to be no
good way of avoiding it.
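
The save, override and restore sequence described above can be sketched
as follows. This is an illustration only, not the code shipped in
debian/hibinit-resume. It assumes the policy is read and written through
/proc/sys/vm/overcommit_memory, where 1 means 'always overcommit', and
the helper names and the swap file argument are hypothetical.

```python
import contextlib
import subprocess

OVERCOMMIT_PATH = "/proc/sys/vm/overcommit_memory"
ALWAYS_OVERCOMMIT = "1"  # 0 = heuristic, 1 = always, 2 = strict accounting


@contextlib.contextmanager
def always_overcommit(path=OVERCOMMIT_PATH):
    """Switch to 'always overcommit', then restore the saved policy."""
    with open(path) as f:
        saved = f.read().strip()  # policy in effect before the override
    with open(path, "w") as f:
        f.write(ALWAYS_OVERCOMMIT)
    try:
        # Race window: other processes see 'always overcommit' from here on.
        yield saved
    finally:
        with open(path, "w") as f:
            f.write(saved)  # restored even if removing the swap fails


def remove_swap(swap_file, path=OVERCOMMIT_PATH):
    """Disable the given swap file while overcommit is forced on."""
    with always_overcommit(path):
        subprocess.run(["swapoff", swap_file], check=True)
```

The path parameter exists so the logic can be tried against an ordinary
file; writing to the real sysctl and running swapoff both require root.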
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ec2-hibinit-agent/+bug/1863242/+subscriptions