[Bug 1800566] Re: Make reset_devices parameter default for kdump

Guilherme G. Piccoli 1800566 at bugs.launchpad.net
Fri Dec 20 18:02:37 UTC 2019


After some attempts to merge the work needed in LP #1816743 here, we
decided to split the bugs and only work on the 'reset_devices' addition
here.

Cheers,


Guilherme

** Summary changed:

- Make reset_devices parameter default for kdump and decouple kdump systemd service from the KDUMP_CMDLINE_APPEND
+ Make reset_devices parameter default for kdump

** Description changed:

  [Impact]
  
  * Kdump does not configure the crash kernel to perform a
- device reset by default, by passing the "reset_devices" parameter. Also,
- the systemd service "kdump-tools-dump" is tightly-coupled with
- KDUMP_CMDLINE_APPEND and it shouldn't, to prevent user confusion.
+ device reset by default, by passing the "reset_devices" parameter.
  
  * The kernel has the "reset_devices" parameter, which drivers can opt in
  to, performing special handling when this parameter is parsed from the
  command line. For example, in kdump kernels it hints to the drivers that
  they are booting from a non-healthy condition and need to issue some form
  of reset to the adapter, like clearing DMA mappings in their firmware, for
- example. Users currently (kernel v5.2) are: aacraid, hpsa, ipr,
+ example. Users currently (kernel v5.5-rc2) are: aacraid, hpsa, ipr,
  megaraid_sas, mpt3sas, smartpqi, xenbus.
  
  This should be enabled by default in the kdump config file, so that it is
  added to the kdump kernel command line for all versions.
  
- * The systemd service"kdump-tools-dump" is responsible for triggering the execution of the makedumpfile tool ultimately. Kdump from Xenial+ releases rely on systemd as their init system, so this service is the way to trigger the kdump mechanism. Currently it is configured as any other parameter in KDUMP_CMDLINE_APPEND, meaning if user decides to change the line they need to remember adding the systemd service back. It's not really a parameter that should be easily manipulated in kdump line, since there's no use for it except to instruct systemd to load kdump; the only 
- reasonable case for removing it is to debug kdump itself.
- 
- 
  [Test Case]
  
- 1) Deploy a Disco VM e.g. with uvt-kvm
+ 1) Deploy a Bionic VM e.g. with uvt-kvm
  2) Install the kdump-tools package
  3) Run `kdump-config test` and check for the 'reset_devices' parameter:
  
  $ kdump-config test
  ...
  kexec command to be used:
    /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-4.15.0-45-generic root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0 nr_cpus=1 systemd.unit=kdump-tools.service irqpoll nousb ata_piix.prefer_ms_hyperv=0" /var/lib/kdump/vmlinuz
- 
- Also, by changing the KDUMP_CMDLINE_APPEND we can see "systemd.unit
- =kdump-tools.service" to be removed.
  
  
  [Regression Potential]
  
  The regression potential is low, since it doesn't need any changes in
  makedumpfile code and we're only adding a parameter on the crash kernel
  command-line. The risks are related to bad kernel behavior when using
  "reset_devices", for example if a driver has bugs in this path.
  It's considered safer to have the option (and this way prevent problems
  when booting an unhealthy kernel with potential stuck DMAs in the devices)
  than not having it.
- 
- Regarding the other change, about the systemd service, it'll only affect
- users the are debugging kdump itself and it has no known regression
- potential.

-- 
You received this bug notification because you are a member of Ubuntu
Sponsors Team, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1800566

Title:
  Make reset_devices parameter default for kdump

Status in makedumpfile package in Ubuntu:
  In Progress
Status in makedumpfile source package in Trusty:
  Won't Fix
Status in makedumpfile source package in Xenial:
  In Progress
Status in makedumpfile source package in Bionic:
  In Progress
Status in makedumpfile source package in Cosmic:
  Won't Fix
Status in makedumpfile source package in Disco:
  In Progress
Status in makedumpfile source package in Eoan:
  In Progress
Status in makedumpfile source package in Focal:
  In Progress

Bug description:
  [Impact]

  * By default, kdump does not pass the "reset_devices" parameter to the
  crash kernel, so the crash kernel is not configured to perform a device
  reset.

  * The kernel has the "reset_devices" parameter, which drivers can opt in
  to, performing special handling when this parameter is parsed from the
  command line. For example, in kdump kernels it hints to the drivers that
  they are booting from a non-healthy condition and need to issue some
  form of reset to the adapter, such as clearing DMA mappings in their
  firmware. Current users (kernel v5.5-rc2) are: aacraid, hpsa, ipr,
  megaraid_sas, mpt3sas, smartpqi, xenbus.

  This should be enabled by default in the kdump config file, so that it
  is added to the kdump kernel command line for all versions.
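
  As a rough sketch of what such a default could look like, assuming the
  parameter is carried in KDUMP_CMDLINE_APPEND in /etc/default/kdump-tools
  and simply prepended to the parameters seen in the kexec command line of
  the test case (the actual packaged default may differ):

```shell
# /etc/default/kdump-tools (illustrative fragment, not the packaged file)
# Assumption: reset_devices is prepended to the existing append string.
KDUMP_CMDLINE_APPEND="reset_devices nr_cpus=1 systemd.unit=kdump-tools.service irqpoll nousb ata_piix.prefer_ms_hyperv=0"
```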

  [Test Case]

  1) Deploy a Bionic VM e.g. with uvt-kvm
  2) Install the kdump-tools package
  3) Run `kdump-config test` and check for the 'reset_devices' parameter:

  $ kdump-config test
  ...
  kexec command to be used:
    /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-4.15.0-45-generic root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0 nr_cpus=1 systemd.unit=kdump-tools.service irqpoll nousb ata_piix.prefer_ms_hyperv=0" /var/lib/kdump/vmlinuz
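
  The check in step 3 could be scripted roughly as follows. The helper
  name has_reset_devices is hypothetical (it is not part of kdump-tools);
  it only scans whatever text it receives on stdin for a standalone
  reset_devices token:

```shell
# Hypothetical helper: succeeds if the text read from stdin contains
# reset_devices as a standalone kernel parameter (delimited by whitespace,
# a double quote, or the start/end of a line).
has_reset_devices() {
    grep -Eq '(^|[[:space:]"])reset_devices([[:space:]"]|$)'
}

# Usage on a system with kdump-tools installed (run as root):
#   kdump-config test | has_reset_devices && echo "reset_devices present"
```

  With the command line shown above, which lacks the parameter, the helper
  returns a non-zero status.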

  
  [Regression Potential]

  The regression potential is low, since it doesn't need any changes in
  makedumpfile code and we're only adding a parameter on the crash
  kernel command-line. The risks are related to bad kernel behavior when
  using "reset_devices", for example if a driver has bugs in this path.
  It's considered safer to have the option (and this way prevent
  problems when booting an unhealthy kernel with potential stuck DMAs in
  the devices) than not having it.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1800566/+subscriptions


