[Bug 1987387] Re: [UBUNTU 20.04] zgetdump can not handle multivolume dumps
Frank Heimes
1987387 at bugs.launchpad.net
Mon Jan 23 10:47:45 UTC 2023
Hello Thorsten,
the Ubuntu SRU policy is that we have to patch the latest releases first and then work down to the oldest release that is affected, and that just took some time (esp. during the year-end break).
But it's almost there - and you can even use it today - from proposed.
s390-tools version 2.12.0-0ubuntu3.7:
https://launchpad.net/ubuntu/+source/s390-tools/2.12.0-0ubuntu3.7
is the package that includes the zgetdump fix (as well as support for the secure boot trailer).
You can easily use it as follows:
sudo add-apt-repository -y "deb http://us.ports.ubuntu.com/ubuntu-ports/ $(lsb_release -sc)-proposed main universe"
sudo apt update
sudo apt install --yes s390-tools
(It will soon be promoted from -proposed to -updates.)
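To confirm that the installed package already carries the fix, the installed version can be compared against 2.12.0-0ubuntu3.7. A minimal sketch follows; `version_ge` is a hypothetical helper name, and it assumes GNU `sort -V`, which only approximates Debian version ordering (on Ubuntu, `dpkg --compare-versions` is the authoritative comparison):

```shell
# Hypothetical helper: succeeds if version $1 is at least version $2.
# Relies on GNU sort -V, which only approximates Debian version ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage on the target system (assumes dpkg-query is available):
#   installed=$(dpkg-query -W -f='${Version}' s390-tools)
#   version_ge "$installed" 2.12.0-0ubuntu3.7 && echo "zgetdump fix present"
```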
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to s390-tools-signed in Ubuntu.
https://bugs.launchpad.net/bugs/1987387
Title:
[UBUNTU 20.04] zgetdump can not handle multivolume dumps
Status in Ubuntu on IBM z Systems:
Fix Committed
Status in s390-tools package in Ubuntu:
Fix Released
Status in s390-tools-signed package in Ubuntu:
Fix Released
Status in s390-tools source package in Focal:
Fix Committed
Status in s390-tools-signed source package in Focal:
Fix Committed
Bug description:
SRU Justification:
==================
[Impact]
* The zgetdump tool (as part of the current s390-tools version in focal)
is not able to handle multi-volume (DASD disk) dumps.
* While this is rarely needed, it is extremely annoying if one
needs it, usually urgently, and it does not work.
* On s390x systems multi-volume (DASD disk) dumps are pretty common:
because disk resources are usually costly and therefore limited,
DASD disks are usually relatively small.
[Fix]
* d55b787 d55b787d05eb9bd70f93c36cf859b66b2ad02038 "zgetdump: Fix
device node determination via sysfs"
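For illustration only, and not the upstream C code: a device node lookup via sysfs can be sketched in shell. The sketch assumes that the block device of a CCW device is exposed under a `block/` subdirectory of the bus ID directory, as on current kernels; `busid_to_devno` and the `SYSFS_ROOT` override are hypothetical names introduced here.

```shell
# Hypothetical helper: print the "major:minor" of the block device that
# belongs to CCW bus ID $1. SYSFS_ROOT defaults to /sys and can be
# overridden for testing against a fake tree.
busid_to_devno() {
    root=${SYSFS_ROOT:-/sys}
    for dev in "$root/bus/ccw/devices/$1"/block/*/dev; do
        # An unmatched glob stays literal and fails this readability check.
        [ -r "$dev" ] || continue
        cat "$dev"
        return 0
    done
    return 1
}

# Usage: busid_to_devno 0.0.0002
```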
[Test Plan]
* Have an IBM zSystems LPAR with Ubuntu 20.04 / focal (latest).
* Have two (or more) additional DASD disks reserved for storing dumps,
assigned to the system and enabled:
sudo chzdev -e 0002, 0003
(Assuming 0001 has Ubuntu 20.04 installed - all 3390 disks).
Get the block device names using 'lsdasd' - assuming here dasdb, dasdc
* If needed low-level format these disks:
sudo dasdfmt -y -b 4096 /dev/dasdb
sudo dasdfmt -y -b 4096 /dev/dasdc
* Create one single partition per disk:
sudo fdasd -a /dev/dasdb
sudo fdasd -a /dev/dasdc
* Create an mvdump.conf file that points to the above disks
sudo vi /etc/mvdump.conf
...
cat /etc/mvdump.conf
/dev/dasdb1
/dev/dasdc1
* Re-write the zipl boot record like this:
sudo zipl -n -M /etc/mvdump.conf
* Now ipl the system from the first dump DASD: 0002
and initiate the DASD dump by:
1. Stop all CPUs.
2. Store status on the IPL CPU.
3. IPL the dump tool on the IPL CPU.
* Wait until the dump is completed and re-IPL Ubuntu 20.04
(0001).
* Without this fix one will see a message like this:
sudo zgetdump -i /dev/dasdb
zgetdump: Could not open "/sys/bus/ccw/devices/0.0.0002/dasdb/dev" (No such file or directory)
* With the fix one will see a message like this:
sudo zgetdump -i /dev/dasdb
General dump info:
Dump format........: s390mv_ext
Version............: 1
Dump created.......:
...
Dump device info:
Volume 0: 0.0.0002 (online/active)
Volume 1: 0.0.0003 (online/valid)
* For more in-depth details see the
'Linux on System z. Using the Dump Tools.' documentation:
https://www.ibm.com/docs/en/linux-on-systems?topic=dump-hmc-se-example
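The mvdump.conf step of the test plan above can also be done non-interactively instead of with an editor. A minimal sketch, where `write_mvdump_conf` is a hypothetical helper name and the partition names are the ones assumed in the test plan:

```shell
# Hypothetical helper: write one dump partition per line to the file $1.
# All remaining arguments are taken as partition device nodes.
write_mvdump_conf() {
    conf=$1
    shift
    printf '%s\n' "$@" > "$conf"
}

# Usage matching the test plan (needs root, overwrites the file):
#   write_mvdump_conf /etc/mvdump.conf /dev/dasdb1 /dev/dasdc1
#   sudo zipl -n -M /etc/mvdump.conf
```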
[Where problems could occur]
* A new function got introduced to 'check whether a device with a given
busid is online'.
Issues could occur in case this function is broken and
checks for a wrong busid, has a wrong path
or handles the status wrongly.
* The kind of 'little' refactoring of that patch may lead to
further unexpected issues (that can largely be identified by a test build).
* The additional use of libutil functions may cause issues
in case of an outdated libutil that does not offer all needed functions.
(Testable with a test build.)
* Erroneous code may even break single volume dumps
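To make the first point above more concrete: the online state of a CCW device can be read from its `online` attribute in sysfs. The following is a shell sketch of such a check, not the function from the patch; `ccw_is_online` and the `SYSFS_ROOT` override are hypothetical names introduced here.

```shell
# Hypothetical helper: succeed if the CCW device with bus ID $1 is online.
# SYSFS_ROOT defaults to /sys and can be overridden for testing.
ccw_is_online() {
    f=${SYSFS_ROOT:-/sys}/bus/ccw/devices/$1/online
    # The attribute contains "1" for an online device, "0" otherwise.
    [ -r "$f" ] && [ "$(cat "$f")" = "1" ]
}

# Usage: ccw_is_online 0.0.0002 && echo "0.0.0002 is online"
```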
[Other Info]
* This code is known to work with hirsute and newer Ubuntu releases,
esp. jammy (respectively their s390-tools versions).
* The upstream code can be cleanly cherry-picked,
hence is applied as-is.
* An updated s390-tools version 2.12.0-0ubuntu3.7 with a
patched zgetdump tool was built and made available via this PPA:
https://launchpad.net/~fheimes/+archive/ubuntu/lp1987387
and was already successfully tested!
__________
== Comment: #0 - Thorsten Diehl <thorsten.diehl at de.ibm.com> - 2022-08-16 12:40:46 ==
I installed Ubuntu 20.04.4 LTS on IBM z14, enabled two DASDs, created one partition on each DASD and created an mvdump.conf file like this:
/dev/dasdc1
/dev/dasdd1
I wrote the boot record by zipl -n -M mvdump.conf and IPLed the system from dasdc devno.
The dump completed successfully.
When I then tried to get this dump via zgetdump on the restarted Ubuntu 20.04.4 system (zgetdump -v reports version 2.12.0-build-20220506), I got the following error:
root@m8330032:~# zgetdump -i /dev/dasdc
zgetdump: Could not open "/sys/bus/ccw/devices/0.0.9405/dasdc/dev" (No such file or directory)
root@m8330032:~# zgetdump -i /dev/dasdc1
zgetdump: Could not open "/sys/bus/ccw/devices/0.0.9405/dasdc/dev" (No such file or directory)
root@m8330032:~# zgetdump -i /dev/dasdd
zgetdump: Could not open "/sys/bus/ccw/devices/0.0.9405/dasdc/dev" (No such file or directory)
root@m8330032:~# zgetdump -i /dev/dasdd1
zgetdump: No valid dump found on "/dev/dasdd1"
root@m8330032:~#
However, if I run the same zgetdump on another system (e.g. with a newer s390-tools version), I get the expected result:
m83lp32:~ # zgetdump -i /dev/dasdc
General dump info:
Dump format........: s390mv_ext
Version............: 1
Dump created.......: Tue, 16 Aug 2022 18:31:57 +0200
Dump ended.........: Tue, 16 Aug 2022 18:32:02 +0200
Dump CPU ID........: ff1fa1e739068000
UTS node name......: m8330032.lnxne.boe
UTS kernel release.: 5.4.0-124-generic
UTS kernel version.: #140-Ubuntu SMP Thu Aug 4 02:23:07 UTC 2022
Build arch.........: s390x (64 bit)
System arch........: s390x (64 bit)
CPU count (online).: 16
CPU count (real)...: 16
Dump memory range..: 4096 MB
Real memory range..: 4096 MB
Dump file size.....: 849 MB
Memory map:
0000000000000000 - 00000000ffffffff (4096 MB)
Dump device info:
Volume 0: 0.0.9405 (online/active)
Volume 1: 0.0.9406 (online/valid)
m83lp32:~ #
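The volume list in output like the above can be extracted for a scripted check. A minimal sketch, assuming the "Volume N: <busid> (<state>)" line format shown in this report; `list_dump_volumes` is a hypothetical helper name:

```shell
# Hypothetical helper: read `zgetdump -i` output on stdin and print
# "<busid> <state>" for every "Volume N: ..." line it contains.
list_dump_volumes() {
    sed -n 's/^ *Volume [0-9][0-9]*: *\([^ ]*\) *(\(.*\))$/\1 \2/p'
}

# Usage: zgetdump -i /dev/dasdc | list_dump_volumes
```

For the output in this report, this would print one line per volume, each with the bus ID followed by its state.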
The error is easily reproducible.
Please update zgetdump to a newer version to solve this RAS problem.
With Jammy (22.04.1; s390-tools version 2.20.0-build-20220623) this
problem does not occur.
== Comment: #3 - Jan Hoeppner <Jan.Hoeppner at de.ibm.com> - 2022-08-22 01:43:45 ==
Several issues with multivolume dumps were fixed in s390-tools v2.15.1: https://github.com/ibm-s390-linux/s390-tools/releases/tag/v2.15.1
Especially the following upstream commit for zgetdump:
https://github.com/ibm-s390-linux/s390-tools/commit/d55b787d05eb9bd70f93c36cf859b66b2ad02038
---
External link: https://warthogs.atlassian.net/browse/PEI-30