[Bug 1864041] Re: xen_netfront devices unresponsive after hibernation/resume
Balint Reczey
balint.reczey at canonical.com
Tue Feb 25 18:39:24 UTC 2020
I'm ready to put the workaround into ec2-hibinit-agent, but it would be
much better if the kernel could be fixed.
** Also affects: ec2-hibinit-agent (Ubuntu)
Importance: Undecided
Status: New
** Also affects: linux-aws (Ubuntu Bionic)
Importance: Undecided
Status: New
** Also affects: ec2-hibinit-agent (Ubuntu Bionic)
Importance: Undecided
Status: New
** Also affects: linux-aws (Ubuntu Focal)
Importance: Undecided
Status: New
** Also affects: ec2-hibinit-agent (Ubuntu Focal)
Importance: Undecided
Status: New
** Also affects: linux-aws (Ubuntu Eoan)
Importance: Undecided
Status: New
** Also affects: ec2-hibinit-agent (Ubuntu Eoan)
Importance: Undecided
Status: New
** Changed in: ec2-hibinit-agent (Ubuntu Focal)
Status: New => In Progress
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to ec2-hibinit-agent in Ubuntu.
https://bugs.launchpad.net/bugs/1864041
Title:
xen_netfront devices unresponsive after hibernation/resume
Status in ec2-hibinit-agent package in Ubuntu:
In Progress
Status in linux-aws package in Ubuntu:
New
Status in ec2-hibinit-agent source package in Bionic:
New
Status in linux-aws source package in Bionic:
New
Status in ec2-hibinit-agent source package in Eoan:
New
Status in linux-aws source package in Eoan:
New
Status in ec2-hibinit-agent source package in Focal:
In Progress
Status in linux-aws source package in Focal:
New
Bug description:
The xen_netfront device is sometimes unresponsive after a hibernate
and resume event. This is limited to the c4, c5, m4, m5, r4, r5
instance families, all of which are Xen-based and support
hibernation.
When the issue occurs, the instance is inaccessible without a full
restart. Debugging by running a process which outputs regularly to the
serial console shows that the instance is still running.
A workaround is to build xen_netfront as a separate loadable module,
then reload the module and restart networking in the resume handler.
For example:
modprobe -r xen_netfront
modprobe xen_netfront
systemctl restart systemd-networkd
With this workaround in place, the unresponsive issue is no longer
observed.
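The three commands above could be wrapped in a resume hook along the
following lines. This is only a sketch: the report does not say which
resume handler is used, so the systemd system-sleep hook convention
(script called with "pre" or "post" and the sleep type) is an
assumption here, and DRY_RUN, run and reload_xen_netfront are
hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Sketch of a resume hook that reloads xen_netfront and restarts networking.
# Assumption: installed as a systemd system-sleep hook, which is invoked
# with $1 = pre|post and $2 = the sleep type (e.g. hibernate).

# run: execute a command, or only print it when DRY_RUN=1 (hypothetical
# switch for this example, useful for checking the hook without root).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# The workaround from the bug description, stopping at the first failure.
reload_xen_netfront() {
    run modprobe -r xen_netfront &&
    run modprobe xen_netfront &&
    run systemctl restart systemd-networkd
}

# Act only after resuming from hibernation; do nothing otherwise.
case "${1:-}/${2:-}" in
    post/hibernate|post/hybrid-sleep)
        reload_xen_netfront
        ;;
esac
```

Setting DRY_RUN=1 prints the commands instead of running them, which
allows the hook to be checked on a machine where unloading the network
driver is not desirable.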
To reproduce this problem:
1) Launch a c4, c5, m4, m5, r4, or r5 instance with a 5.0 or 5.3 kernel and on-demand hibernation support enabled.
2) Start a long-running process which generates messages to the serial console.
3) Begin observing these messages on the console (using the AWS UI or CLI to grab a screenshot).
4) Suspend and resume the instance, continuing to refresh the console screenshot.
5) The screenshot should continue to show updates even if ssh access is no longer working.
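For step 2, a minimal heartbeat loop such as the following may be
enough. It is a sketch, not the process used in the original report:
the function name heartbeat and its arguments are hypothetical, and
/dev/console is assumed to be routed to the console that the screenshot
captures.

```shell
#!/bin/sh
# heartbeat TARGET INTERVAL [COUNT]
# Append a numbered, timestamped line to TARGET every INTERVAL seconds.
# COUNT limits the number of lines; 0 (the default) means run until killed.
heartbeat() {
    target=$1
    interval=$2
    count=${3:-0}
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        i=$((i + 1))
        echo "heartbeat $i $(date -u +%H:%M:%S)" >> "$target"
        sleep "$interval"
    done
}

# Example (assumption: run as root so /dev/console is writable):
#   heartbeat /dev/console 5 &
```

If the counter keeps increasing in the console screenshot after resume
while ssh no longer responds, the instance itself is still running and
only the network device is affected.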
---
ProblemType: Bug
ApportVersion: 2.20.9-0ubuntu7.9
Architecture: amd64
DistroRelease: Ubuntu 18.04
Ec2AMI: ami-0edf3b95e26a682df
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-west-2a
Ec2InstanceType: m4.large
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
Package: linux-aws 4.15.0.1058.59
PackageArchitecture: amd64
ProcEnviron:
TERM=screen
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=C.UTF-8
SHELL=/bin/bash
ProcVersionSignature: User Name 5.0.0-1025.28-aws 5.0.21
Tags: bionic ec2-images
Uname: Linux 5.0.0-1025-aws x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm audio cdrom dialout dip floppy lxd netdev plugdev sudo video
_MarkForUpload: True
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ec2-hibinit-agent/+bug/1864041/+subscriptions