[Bug 1864045] Re: [SRU] Hibernation events sometimes missed on repeated attempts
Andrea Righi
andrea.righi at canonical.com
Tue Mar 3 13:52:11 UTC 2020
@rbalint if you can reproduce the problem easily, it would be
interesting to monitor the received ACPI events via acpi_listen.
What I see during my tests is that acpi_listen always shows the sleep
events, meaning that the kernel at least receives them correctly, and
the failure happens in the delivery of these sleep events to the proper
user-space daemon (acpid). So my guess is that something is going wrong
in the communication between kernel and user space that delivers these
events.
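As a rough illustration of the monitoring suggested above, here is a
minimal Python sketch that timestamps acpi_listen output so that sleep
events can be correlated with hibernation attempts. The helper names
(parse_event, stamp_events), the "--listen" flag, and the assumption
that the first whitespace-separated field is the device class (for
example "button/sleep") are illustrative choices, not something taken
from the affected instances.

```python
import subprocess
import sys
import time
from typing import NamedTuple, Optional, Tuple


class AcpiEvent(NamedTuple):
    device_class: str        # first field, e.g. "button/sleep" (assumed layout)
    rest: Tuple[str, ...]    # remaining fields, left unparsed


def parse_event(line: str) -> Optional[AcpiEvent]:
    """Split one acpi_listen line into fields; return None for blank lines."""
    fields = line.split()
    if not fields:
        return None
    return AcpiEvent(fields[0], tuple(fields[1:]))


def is_sleep_event(event: AcpiEvent) -> bool:
    """True if the event's device class looks like a sleep button event."""
    return event.device_class.startswith("button/sleep")


def stamp_events(lines, now=lambda: time.strftime("%Y-%m-%d %H:%M:%S")):
    """Yield each non-blank event line prefixed with a timestamp and a marker."""
    for line in lines:
        event = parse_event(line)
        if event is None:
            continue
        marker = "SLEEP" if is_sleep_event(event) else "other"
        yield f"{now()} [{marker}] {line.strip()}"


if __name__ == "__main__" and sys.argv[1:] == ["--listen"]:
    # Hypothetical flag: only start acpi_listen when explicitly asked to.
    proc = subprocess.Popen(["acpi_listen"], stdout=subprocess.PIPE,
                            universal_newlines=True)
    for stamped in stamp_events(proc.stdout):
        print(stamped, flush=True)
```

Running this with "--listen" while triggering hibernation, and comparing
the timestamps with the acpid log, would show whether a given sleep
event was seen at all.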
Just to make sure: when you say "the second hibernation attempt still
fails", you mean that the system is still up and running (you can still
ssh into it) and the sleep event is lost / not delivered properly, right?
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to acpid in Ubuntu.
https://bugs.launchpad.net/bugs/1864045
Title:
[SRU] Hibernation events sometimes missed on repeated attempts
Status in acpid package in Ubuntu:
Confirmed
Status in linux package in Ubuntu:
Incomplete
Status in acpid source package in Bionic:
Incomplete
Status in linux source package in Bionic:
Incomplete
Status in acpid source package in Eoan:
Confirmed
Status in linux source package in Eoan:
Incomplete
Bug description:
When testing hibernation / resume on AWS with 5.0 or 5.3 kernels on
bionic (using acpid 1:2.0.28-1ubuntu1), we sometimes see failures on
repeated attempts. The first attempt is always triggered, but the
next attempt may not be. As a result, the agent never triggers the
hibernation process and the instance is forced to shut down after
a timeout period.
Two workarounds have been identified. The first is to restart acpid
during the resume handler. The second is to use the latest upstream
acpid (as of Feb 1, 2020). This second workaround indicates there may
be a patch missing in the acpid in bionic (1:2.0.28-1ubuntu1) to work
with the 5.0+ kernels.
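The first workaround could be sketched roughly as follows. The report
does not say which mechanism runs the resume handler, so this sketch
assumes, purely for illustration, a hook that is called with
systemd-sleep style arguments ("post" and "hibernate" after resuming
from hibernation) and that acpid runs as the acpid.service unit. The
function names are hypothetical.

```python
import subprocess
import sys

# Assumed unit name for the acpid daemon.
RESTART_CMD = ["systemctl", "restart", "acpid.service"]


def should_restart(argv):
    """True only for the assumed 'post hibernate' invocation of the hook."""
    return len(argv) >= 2 and argv[0] == "post" and argv[1] == "hibernate"


def restart_acpid(run=subprocess.run):
    """Restart acpid and report whether the command succeeded."""
    result = run(RESTART_CMD, check=False)
    return result.returncode == 0


def handle(argv, run=subprocess.run):
    """Restart acpid after resume; do nothing for any other invocation."""
    if should_restart(argv):
        return restart_acpid(run)
    return False


if __name__ == "__main__":
    # With no arguments (or any phase other than post/hibernate) this is a no-op.
    handle(sys.argv[1:])
```

The run parameter exists only so the restart can be exercised without
touching the real service.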
To reproduce this problem:
1) Launch a c4, c5, m4, m5, r4, or r5 instance type with a 5.0 or 5.3 kernel on a bionic image with on-demand hibernation support enabled.
2) Hibernate and resume the instance, ensuring the system is fully resumed afterward and the swap file has been removed.
3) Hibernate and resume another time. The hibernate should be triggered immediately and the instance should become unresponsive as it saves state to disk.
4) Resume the instance; it should come back with the same processes running.
5) Repeat 3) - 4) as necessary.
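The repeat loop in steps 2) to 5) could be driven by something like the
sketch below. It assumes an EC2 client object with the boto3 method
names (stop_instances with Hibernate=True, start_instances, and the
"instance_stopped" / "instance_running" waiters). The function name,
the settle_seconds pause standing in for "fully resumed and swap file
removed", and the idea of using stop duration as a hint of a timed-out
hibernation are assumptions for illustration only.

```python
import time


def hibernate_resume_cycles(ec2, instance_id, cycles,
                            settle_seconds=120,
                            sleep=time.sleep, clock=time.monotonic):
    """Hibernate and resume an instance repeatedly.

    Returns the time in seconds each stop took. An unusually long stop
    may hint that hibernation was not triggered and the instance was
    shut down after the timeout (heuristic assumption, not a check).
    """
    stop_durations = []
    for _ in range(cycles):
        started = clock()
        # Request hibernation rather than a plain stop.
        ec2.stop_instances(InstanceIds=[instance_id], Hibernate=True)
        ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
        stop_durations.append(clock() - started)

        # Resume, then give the guest time to finish resuming.
        ec2.start_instances(InstanceIds=[instance_id])
        ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
        sleep(settle_seconds)
    return stop_durations
```

With boto3 installed, a real client could be passed in as
boto3.client("ec2"). Whether the guest has really finished resuming
and removed the swap file still has to be verified on the instance
itself.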
---
ProblemType: Bug
ApportVersion: 2.20.9-0ubuntu7.9
Architecture: amd64
DistroRelease: Ubuntu 18.04
Ec2AMI: ami-0edf3b95e26a682df
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-west-2a
Ec2InstanceType: m4.large
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
Package: acpid 1:2.0.28-1ubuntu1
PackageArchitecture: amd64
ProcEnviron:
TERM=screen
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=C.UTF-8
SHELL=/bin/bash
ProcVersionSignature: Ubuntu 5.0.0-1025.28-aws 5.0.21
Tags: bionic ec2-images
Uname: Linux 5.0.0-1025-aws x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm audio cdrom dialout dip floppy lxd netdev plugdev sudo video
_MarkForUpload: True