[Bug 1864045] Re: [SRU] Hibernation events sometimes missed on repeated attempts
Balint Reczey
balint.reczey at canonical.com
Mon Mar 2 20:31:43 UTC 2020
I have tried the 'fixed' acpid version and also tried various versions
with 5.0 and 5.3 kernels, but the issue does not seem to be fixed.
I've backported acpid 1:2.0.32-1ubuntu1 in ppa:rbalint/scratch2 which is
practically the same as 2.0.32-1 compiled on Ubuntu, but the second
hibernation attempt still fails.
Package versions used:
linux-aws-wip/5.3.0.1012.13
acpid/1:2.0.32-1ubuntu1~18.04.0~rbalint3 from ppa:rbalint/scratch2
ec2-hibinit-agent/1.0.0-0ubuntu8~18.04.0~rbalint3 from ppa:rbalint/scratch2
Instance type: c4.large
Vanilla Bionic, i.e. kernel 4.15.0-1060-aws and Bionic's acpid,
hibernates twice without any issue.
Could you please add detailed reproduction steps that show how the new
acpid was used to fix the issue?
For now it looks like a kernel regression, and bisecting the kernel to
find the commit that changed the behaviour would be highly useful.
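A rough sketch of the search a bisection performs, in case it helps with
planning the test runs. It works on a plain ordered list rather than a
git history, and both `first_bad` and the `is_bad` callback are
hypothetical names: `is_bad` stands for "boot this kernel, hibernate
twice, and report whether the second attempt failed".

```python
def first_bad(versions, is_bad):
    """Return the first entry of `versions` for which is_bad() is true.

    Assumes `versions` is ordered oldest to newest, that the last entry
    is bad, and that every entry after the first bad one is also bad.
    """
    lo, hi = 0, len(versions) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid        # first bad entry is at mid or earlier
        else:
            lo = mid + 1    # first bad entry is after mid
    return versions[lo]
```

Each call to `is_bad` costs one full boot and two hibernation attempts, so
the halving matters: the number of test runs grows with the logarithm of
the number of candidates.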
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to acpid in Ubuntu.
https://bugs.launchpad.net/bugs/1864045
Title:
[SRU] Hibernation events sometimes missed on repeated attempts
Status in acpid package in Ubuntu:
Confirmed
Status in linux package in Ubuntu:
Incomplete
Status in acpid source package in Bionic:
Incomplete
Status in linux source package in Bionic:
Incomplete
Status in acpid source package in Eoan:
Confirmed
Status in linux source package in Eoan:
Incomplete
Bug description:
When testing hibernation / resume on AWS with 5.0 or 5.3 kernels on
bionic (using acpid 1:2.0.28-1ubuntu1), we sometimes see failures on
repeated attempts. The first attempt will always be triggered, but the
next attempt may not. The result is that the agent never triggers the
hibernation process and the instance is forced to shut down after
a timeout period.
Two workarounds have been identified. The first is to restart acpid
during the resume handler. The second is to use the latest upstream
acpid (as of Feb 1, 2020). This second workaround indicates there may
be a patch missing in the acpid in bionic (1:2.0.28-1ubuntu1) to work
with the 5.0+ kernels.
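The first workaround might look roughly like the following. This is only
a sketch: it assumes acpid runs as the systemd unit `acpid`, and
`restart_acpid` is a hypothetical helper. How it would be hooked into the
resume handler is not covered here.

```python
import subprocess


def restart_acpid(run=subprocess.run):
    """Restart the acpid service; meant to be called once after resume."""
    # Assumption: acpid is managed by systemd under the unit name "acpid".
    # check=True raises CalledProcessError if the restart fails.
    return run(["systemctl", "restart", "acpid"], check=True)
```

The `run` parameter exists only so the command can be inspected without
actually restarting the service.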
To reproduce this problem:
1) Launch a c4, c5, m4, m5, r4, or r5 instance type with a 5.0 or 5.3 kernel on a bionic image with on-demand hibernation support enabled.
2) Hibernate and resume the instance, ensuring the system is fully resumed afterward and the swap file has been removed.
3) Hibernate and resume another time. The hibernate should be triggered immediately and the instance should become unresponsive as it saves state to disk.
4) Resume the instance. It should come back with the same processes running.
5) Repeat 3) - 4) as necessary.
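The hibernate / resume loop in steps 2) to 5) could be driven from outside
the instance along these lines. This is a sketch under assumptions: `ec2`
is taken to be a boto3 EC2 client, and `hibernate_resume_cycles` is a
hypothetical helper name. It only cycles the instance; checking that the
swap file is gone and that the same processes are still running has to be
done on the instance itself.

```python
def hibernate_resume_cycles(ec2, instance_id, cycles=2):
    """Hibernate and resume one instance `cycles` times via an EC2 client."""
    for attempt in range(1, cycles + 1):
        # Hibernate=True requests hibernation instead of a plain stop.
        ec2.stop_instances(InstanceIds=[instance_id], Hibernate=True)
        ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
        # Starting a hibernated instance resumes it from the saved state.
        ec2.start_instances(InstanceIds=[instance_id])
        ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
        print(f"cycle {attempt} finished")
```

Note that the instance also reaches the stopped state when it is forced
to shut down after the timeout, so a completed cycle here does not prove
that hibernation was actually triggered.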
---
ProblemType: Bug
ApportVersion: 2.20.9-0ubuntu7.9
Architecture: amd64
DistroRelease: Ubuntu 18.04
Ec2AMI: ami-0edf3b95e26a682df
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-west-2a
Ec2InstanceType: m4.large
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
Package: acpid 1:2.0.28-1ubuntu1
PackageArchitecture: amd64
ProcEnviron:
TERM=screen
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=C.UTF-8
SHELL=/bin/bash
ProcVersionSignature: Ubuntu 5.0.0-1025.28-aws 5.0.21
Tags: bionic ec2-images
Uname: Linux 5.0.0-1025-aws x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm audio cdrom dialout dip floppy lxd netdev plugdev sudo video
_MarkForUpload: True
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/acpid/+bug/1864045/+subscriptions