[SRU][E][aws][PULL] Xen / hibernation: xen-netfront panic + resume hangs

Andrea Righi andrea.righi at canonical.com
Wed Jun 3 09:08:26 UTC 2020


BugLink: https://bugs.launchpad.net/bugs/1881869

[Impact]

During our AWS testing we were able to trigger hibernation failures on
some Xen instance types.

One problem is a kernel panic in the resume callback of the xen-netfront
driver. A workaround is to compile the driver as a module and reload it
at resume (we were already doing this reload with the bionic kernel,
which had this driver compiled as a module, but for some reason eoan and
focal had it statically compiled).

Other issues were showing up as hangs on resume. These seem to be
prevented by the new Xen/hibernation patch set posted by Anchal to the
LKML:
https://lore.kernel.org/lkml/cover.1589926004.git.anchalag@amazon.com/

This new patch set is still being reviewed, but according to our tests
it really seems to fix some of these hangs on resume.

In addition to that we can improve hibernation reliability and
performance even more by applying the updated swapoff optimization patch
(that has been merged upstream).

[Test case]

Create a Xen instance in AWS, hibernate/resume multiple times.
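
As a rough illustration of the test case above, the hibernate/resume loop
could be scripted along these lines. This is only a sketch, not the
procedure we actually used: it assumes the AWS CLI is installed and
configured, that the instance was launched with hibernation enabled, and
the function names and the instance id in the usage note are made up.

```python
import subprocess


def hibernate_cycle_commands(instance_id, cycles):
    """Return the AWS CLI invocations for `cycles` hibernate/resume rounds."""
    base = ["aws", "ec2"]
    ids = ["--instance-ids", instance_id]
    commands = []
    for _ in range(cycles):
        # Stop with --hibernate so the guest hibernates instead of shutting down.
        commands.append(base + ["stop-instances", "--hibernate"] + ids)
        commands.append(base + ["wait", "instance-stopped"] + ids)
        # Starting the instance again triggers the resume path.
        commands.append(base + ["start-instances"] + ids)
        # Wait for the status checks; a hang on resume should time out here.
        commands.append(base + ["wait", "instance-status-ok"] + ids)
    return commands


def run_cycles(instance_id, cycles, dry_run=True):
    """Print the commands (dry run) or execute them, stopping at the first failure."""
    for cmd in hibernate_cycle_commands(instance_id, cycles):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling run_cycles("i-0123456789abcdef0", 10) with the default dry_run=True
only prints the commands, so the sequence can be reviewed before running it
against a real instance.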

[Fix]

The following set of fixes can be used to improve hibernation
performance and reliability:
 - new Xen/hibernation patch set from the LKML (see link above)
 - config change to compile xen-netfront as a module
 - new swapoff optimization patch
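
To double check the config part on a built kernel, something like the
following sketch can be run against the kernel config file. It assumes
that xen-netfront corresponds to the CONFIG_XEN_NETDEV_FRONTEND option
and also checks CONFIG_DMA_CMA, as per the note in [See also]; the
helper names are hypothetical and not part of this pull request.

```python
import re


def parse_kconfig(text):
    """Map option name -> value ('y', 'm', 'n' or a literal) from .config-style text."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if unset:
            options[unset.group(1)] = "n"
            continue
        assigned = re.fullmatch(r"(CONFIG_\w+)=(.*)", line)
        if assigned:
            options[assigned.group(1)] = assigned.group(2)
    return options


def check_hibernation_config(text):
    """Return a list of problems; an empty list means the config looks as expected."""
    opts = parse_kconfig(text)
    problems = []
    # Assumption: xen-netfront is controlled by CONFIG_XEN_NETDEV_FRONTEND.
    if opts.get("CONFIG_XEN_NETDEV_FRONTEND") != "m":
        problems.append("CONFIG_XEN_NETDEV_FRONTEND should be 'm'")
    # LP: #1879711 disables CONFIG_DMA_CMA; a missing option counts as disabled.
    if opts.get("CONFIG_DMA_CMA", "n") != "n":
        problems.append("CONFIG_DMA_CMA should be disabled")
    return problems
```

For example, passing the content of /boot/config-$(uname -r) to
check_hibernation_config() should return an empty list on a kernel that
includes these config changes.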

[Regression potential]

The xen-netfront config change and the new swapoff optimization patch
are pretty safe (one is a config change that affects only the
xen-netfront driver, the other is a clean cherry-pick of an upstream
commit).

The new Xen/hibernation update is pretty big and the new patches are
still under review, however according to our tests it really seems to
fix some of the hang issues (it definitely makes things better).
Moreover, all the changes affect Xen and are restricted to the
hibernation/resume code paths, so, in conclusion, the overall
regression potential is minimal.

[See also]

NOTE: the fix mentioned in LP: #1879711 (disable CONFIG_DMA_CMA) was
also applied during our tests and it is also required to make
hibernation stable in Xen.

----------------------------------------------------------------

The following changes since commit 9eff0e6fb43dbee44f457d927812e893be55c2af:

  Revert "UBUNTU SAUCE [aws]: xen: Only restore the ACPI SCI interrupt in xen_restore_pirqs." (2020-06-03 10:45:50 +0200)

are available in the Git repository at:

  . aws-arighi

for you to fetch changes up to 29679695355999b4c7d32d071cd93e9606bb5461:

  UBUNTU SAUCE [aws]: mm: swap: increase default swap readahead size (2020-06-03 10:48:36 +0200)

----------------------------------------------------------------
Anchal Agarwal (4):
      UBUNTU: SAUCE: x86/xen: Introduce new function to map HYPERVISOR_shared_info on Resume
      UBUNTU: SAUCE: genirq: Shutdown irq chips in suspend/resume during hibernation
      UBUNTU: SAUCE: xen: Introduce wrapper for save/restore sched clock offset
      UBUNTU: SAUCE: xen: Update sched clock offset to avoid system instability in hibernation

Andrea Righi (17):
      Revert "UBUNTU SAUCE [aws]: xen: restore pirqs on resume from hibernation."
      Revert "UBUNTU SAUCE [aws]: ACPICA: Enable sleep button on ACPI legacy wake"
      Revert "UBUNTU SAUCE [aws]: mm: swap: improve swap readahead heuristic"
      Revert "UBUNTU SAUCE [aws] PM / hibernate: reduce memory pressure during image writing"
      Revert "UBUNTU: SAUCE [aws] x86/xen: close event channels for PIRQs in system core suspend callback"
      Revert "UBUNTU: SAUCE [aws] xen/events: add xen_shutdown_pirqs helper function"
      Revert "UBUNTU: SAUCE [aws] x86/xen: save and restore steal clock"
      Revert "UBUNTU: SAUCE [aws] xen-time-introduce-xen_-save-restore-_steal_clock"
      Revert "UBUNTU: SAUCE [aws] xen-netfront: add callbacks for PM suspend and hibernation support"
      Revert "UBUNTU: SAUCE [aws] x86/xen: add system core suspend and resume callbacks"
      Revert "UBUNTU: SAUCE [aws] x86/xen: Introduce new function to map HYPERVISOR_shared_info on Resume"
      Revert "UBUNTU: SAUCE: xenbus: add freeze/thaw/restore callbacks support"
      Revert "UBUNTU: SAUCE: xen/manage: introduce helper function to know the on-going suspend mode"
      Revert "UBUNTU: SAUCE: xen/manage: keep track of the on-going suspend mode"
      UBUNTU: [Config] aws: compile xen-netfront as module
      mm: swap: properly update readahead statistics in unuse_pte_range()
      UBUNTU SAUCE [aws]: mm: swap: increase default swap readahead size

Juergen Gross (1):
      xen/blkfront: fix ring info addressing

Munehisa Kamata (7):
      UBUNTU: SAUCE: xen/manage: keep track of the on-going suspend mode
      UBUNTU: SAUCE: xenbus: add freeze/thaw/restore callbacks support
      UBUNTU: SAUCE: x86/xen: add system core suspend and resume callbacks
      UBUNTU: SAUCE: xen-blkfront: add callbacks for PM suspend and hibernation
      UBUNTU: SAUCE: xen-netfront: add callbacks for PM suspend and hibernation
      UBUNTU: SAUCE: xen/time: introduce xen_{save,restore}_steal_clock
      UBUNTU: SAUCE: x86/xen: save and restore steal clock

 arch/x86/xen/suspend.c                    |  12 +-
 arch/x86/xen/time.c                       |  15 ++-
 arch/x86/xen/xen-ops.h                    |   2 +
 debian.aws/config/annotations             |   2 +-
 debian.master/config/config.common.ubuntu |   2 +-
 drivers/acpi/acpica/hwsleep.c             |  11 --
 drivers/block/xen-blkfront.c              | 197 +++++++++++++++++++++++-------
 drivers/net/xen-netfront.c                |  21 ++--
 drivers/xen/events/events_base.c          |  18 +--
 drivers/xen/manage.c                      |   8 +-
 drivers/xen/time.c                        |   7 +-
 drivers/xen/xenbus/xenbus_probe.c         |  47 +++----
 include/linux/irq.h                       |   2 +
 include/xen/events.h                      |   2 -
 kernel/irq/chip.c                         |   2 +-
 kernel/irq/internals.h                    |   1 +
 kernel/irq/pm.c                           |  31 +++--
 kernel/power/swap.c                       |  24 +++-
 mm/swap_state.c                           |  60 +++++++--
 mm/swapfile.c                             |  12 +-
 20 files changed, 327 insertions(+), 149 deletions(-)


