[Bug 1863242] Re: [SRU] OOM errors with new kernels on resuming

Launchpad Bug Tracker 1863242 at bugs.launchpad.net
Tue Apr 7 19:23:56 UTC 2020


This bug was fixed in the package ec2-hibinit-agent -
1.0.0-0ubuntu4~18.04.4

---------------
ec2-hibinit-agent (1.0.0-0ubuntu4~18.04.4) bionic; urgency=medium

  * debian/hibinit-resume: Add extra steps around swapoff to avoid OOM errors.
    Also work around xen-netfront not resuming properly.
    Thanks to Francis Ginther for the initial patch (LP: #1863242, #1864041)

 -- Balint Reczey <rbalint at ubuntu.com>  Mon, 23 Mar 2020 13:03:38 +0100

** Changed in: ec2-hibinit-agent (Ubuntu Bionic)
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to ec2-hibinit-agent in Ubuntu.
https://bugs.launchpad.net/bugs/1863242

Title:
  [SRU] OOM errors with new kernels on resuming

Status in ec2-hibinit-agent package in Ubuntu:
  Fix Released
Status in ec2-hibinit-agent source package in Bionic:
  Fix Released
Status in ec2-hibinit-agent source package in Eoan:
  Fix Committed

Bug description:
  [Impact]

   * When EC2 instances resume from hibernation, processes are sometimes
  killed by the OOM killer.

  [Test Case]

   * Set up an EC2 instance to allow hibernation as the stop instance action.
   * Start the attached Python script in a screen session to reserve 85% of the memory:
    python3 mem-waster-pct.py -p 85

   * Log out, hibernate, then resume the instance.
   * Observe that the Python script is still running after resuming.
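
  The attached mem-waster-pct.py is not reproduced in this notification.
  As a rough idea of what such a helper could do, here is a minimal
  sketch that reserves a percentage of total RAM and touches every page
  so the memory is really resident. The function names, the 4096-byte
  page size and the parsing of /proc/meminfo are assumptions made for
  illustration only, not the contents of the attached script.

```python
import time

PAGE_SIZE = 4096  # assumed page size; used only to touch each page once


def total_memory_bytes(meminfo_text):
    """Return MemTotal in bytes from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # The line looks like "MemTotal:  16384256 kB".
            return int(line.split()[1]) * 1024
    raise ValueError("MemTotal not found")


def reserve(percent, total_bytes):
    """Allocate percent% of total_bytes and write to every page."""
    size = total_bytes * percent // 100
    buf = bytearray(size)
    # A zero-filled allocation may not be backed by real pages until
    # it is written to, so touch one byte per page.
    for offset in range(0, size, PAGE_SIZE):
        buf[offset] = 1
    return buf


def main(percent=85):
    """Hold the reservation until the process is interrupted."""
    with open("/proc/meminfo") as f:
        total = total_memory_bytes(f.read())
    held = reserve(percent, total)
    print("holding", len(held), "bytes")
    while True:
        time.sleep(60)
```

  Calling main(85) inside a screen session would then play the role of
  the command in the test case above.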

  [Regression Potential]

   * The fix sets the memory overcommit policy to 'always overcommit'
  while the swap file is being removed. This helps to deal with the
  swap space shrinking during the removal. No side effect is expected,
  since processes trying to allocate an excessive amount of memory would
  fail with stricter policies, too.

  The fix introduces a potential race condition with processes detecting
  the overcommit policy:

  The policy in effect when the hibernation took place is saved shortly
  after resuming and is restored after the swap file is removed. During
  this time window other processes detect the policy as 'always
  overcommit', even though it may not have been set as such before
  hibernation and may be restored to a different policy after the swap
  file is removed. Hitting this race condition seems unlikely, and there
  seems to be no good way of avoiding it.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ec2-hibinit-agent/+bug/1863242/+subscriptions
