[Bug 1732028] Re: transient boot fail with overlayroot [open-iscsi autopkg tests]

 Christian Ehrhardt  1732028 at bugs.launchpad.net
Thu Jul 19 11:00:26 UTC 2018


I can confirm that the guest runs that way and that it uses the intended
root=/dev/disk/by-path/ip-10.0.12.2:3260-iscsi-tgt-boot-test-b7X8g2-lun-1-part1.

The one unfortunate difference from the runs on LP infra remains: here it works every time :-/
Note: I ran most of them in the background after verifying once how they behave, but I wanted to make sure they complete reproducibly.
a - manual login, no user data case 3/3
b - user data collect data to disk and shut down 3/3
c - user data collect just shut down 3/3
d - user data collect data to disk and shut down, no tty (nohup CMD > log 2>&1) 3/3
e - non-interactive, like autopkgtest would run it 3/3
f - non-interactive, like autopkgtest would run it, forcing KVM mode 3/3

Next I parsed all the logs that the recent failures on LP accumulated.
I found this error, which looked interesting:
  iscsistart: initiator reported error (15 - session exists)
It is present in ALL the logs that we gathered.
But just when I thought I had a lead, I realized that the good cases have this message as well.
I also found the previously mentioned "ordering cycle on media-root\x2dro.mount/start"
in good as well as bad logs.
So neither of these is the cause.

The boot around the iscsi root essentially goes through the steps below, with some noise in between. Comparing good and bad cases, they start out the same and even share a few errors that seem to be red herrings:
[...] (early boot)
all logs (id changes) - Logging into tgt-boot-test-o3PlsL 10.1.1.2:3260,1
all logs - mounted filesystem with ordered data
on 7/17 logs (also good) - Found ordering cycle ...
all bad cases, and only those - Dependency failed for Local File Systems
all bad cases, and only those - Timed out waiting for device (devices change)
all bad cases, and only those - Started Emergency Shell

That matches this bug here.
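
The comparison above suggests a quick way to triage a pile of captured logs. This is only a minimal sketch, assuming each boot's console output has been saved to its own file. The function name classify_boot_log is hypothetical, and the three strings are the messages that so far showed up only in the bad cases:

```shell
# Hypothetical helper: print "bad" if a console log contains any of the
# messages seen only in failing boots, otherwise print "good".
# $1: path to a saved console log (one file per boot is an assumption).
classify_boot_log() {
    log=$1
    for marker in \
        "Dependency failed for Local File Systems" \
        "Timed out waiting for device" \
        "Started Emergency Shell"
    do
        # -F: match the marker as a fixed string, -q: no output
        if grep -qF -- "$marker" "$log"; then
            echo bad
            return 0
        fi
    done
    echo good
}

# Example use (the logs/ directory is an assumption):
#   for f in logs/*.log; do
#       printf '%s %s\n' "$(classify_boot_log "$f")" "$f"
#   done
```

The "session exists" and "ordering cycle" messages are deliberately not used as markers, since they appear in good logs too.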

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to open-iscsi in Ubuntu.
https://bugs.launchpad.net/bugs/1732028

Title:
  timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests
  on LP Infra]

Status in open-iscsi package in Ubuntu:
  New
Status in systemd package in Ubuntu:
  Confirmed

Bug description:
  This issue keeps cropping up.  It shows itself in open-iscsi autopkg tests.
  I think it might just be "really slow system".  It seems the timeout is only
   1 minute 30 seconds for the disk to appear, and in a happy run you
  might see something very close:

  [   ***] A start job is running for dev-disk…-UEFI.device (1min 29s / 1min 32s)
  [  *** ] A start job is running for dev-disk…-UEFI.device (1min 30s / 1min 32s)
  [  OK  ] Found device VIRTUAL-DISK UEFI.
           Mounting /boot/efi...
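
  To see how close a run came to the limit, the elapsed and limit values can be pulled out of such a progress line. This is a rough sketch only: the helper names span_to_seconds and start_job_margin are hypothetical, and only the "min" and "s" units that appear in the output above are handled.

```shell
# Convert a time span such as "1min 29s" to seconds.
# Only the "min" and "s" units are handled (an assumption of this sketch).
span_to_seconds() {
    awk -v span="$1" 'BEGIN {
        n = split(span, parts, " ")
        total = 0
        for (i = 1; i <= n; i++) {
            # awk takes the leading digits of "1min" or "29s" as the number
            if (parts[i] ~ /min$/)    total += 60 * (parts[i] + 0)
            else if (parts[i] ~ /s$/) total += parts[i] + 0
        }
        print total
    }'
}

# Print the seconds left before the timeout for one progress line of the
# form "... (ELAPSED / LIMIT)"; returns 1 if the line has no such part.
start_job_margin() {
    spans=$(printf '%s\n' "$1" | sed -n 's/.*(\(.*\) \/ \(.*\)).*/\1|\2/p')
    [ -n "$spans" ] || return 1
    elapsed=$(span_to_seconds "${spans%%|*}")
    limit=$(span_to_seconds "${spans#*|}")
    echo $((limit - elapsed))
}
```

  A small margin on a passing run would support the "really slow system" theory.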

  ---
  There is information on the open-iscsi tests at [1].
    [1] https://git.launchpad.net/~usd-import-team/ubuntu/+source/open-iscsi/tree/debian/tests/README-boot-test.md

  The tests set up an iscsi target and boot a kvm guest off that read-
  only root with overlayroot.

  # get open-iscsi source
  $ apt-get source open-iscsi
  $ cd open-iscsi-2.0.874/

  $ sudo apt-get install -qy simplestreams tgt qemu-system-x86 \
       cloud-image-utils distro-info

  ## we're now mostly following debian/tests/README-boot-test.md
  # download the image and get kernel/initrd
  $ PATH=$PWD/debian/tests:$PATH
  $ get-image bionic.d bionic
  $ sudo "$(which patch-image)" \
      --kernel=bionic.d/kernel --initrd=bionic.d/initrd bionic.d/disk.img

  $ tgt-boot-test -v bionic.d/disk.img bionic.d/kernel bionic.d/initrd
  ....

  Success is being able to log in with 'ubuntu' and 'passw0rd'.
  Failure as seen in the log is dropping into an emergency shell.

  Once inside (this was successful) you'll see a mostly sane system.
  Some things to note:
  a.) tgt-boot-test boots without kvm enabled.  This is because using
   kvm with qemu in nested virt would cause system lockups. It's slower
  but more reliable to go without.
  b.) under bug 1723183 I made overlayroot comment out the root filesystem
  from the rendered /etc/fstab.  That was because systemd got confused and
  assumed that /media/root-ro had to be on top of /.
  c.) you can enable or disable kvm by setting _USE_KVM=0 or _USE_KVM=1
     in your environment.

  $ grep -v "^# " /etc/fstab
  #
  #
  #LABEL=cloudimg-rootfs /media/root-ro/ ext4 ro,defaults,noauto 0 0
  /media/root-ro/ / overlay lowerdir=/media/root-ro/,upperdir=/media/root-rw/overlay/,workdir=/media/root-rw/overlay-workdir/_ 0 0
  LABEL=UEFI /boot/efi vfat defaults 0 0 # overlayroot:fs-unsupported

  $ sudo blkid
  /dev/sda1: LABEL="cloudimg-rootfs" UUID="7b1980bd-9102-4356-8df0-ec7a0c062411" TYPE="ext4" PARTUUID="c0b5ace0-4703-4667-babb-3d38137cab88"
  /dev/sda15: LABEL="UEFI" UUID="B177-3CC9" TYPE="vfat" PARTUUID="0ab0b9fd-2c28-4724-857a-1559f0cf76ea"
  /dev/sda14: PARTUUID="221662d6-cab0-4290-ba1c-e72acf2bf193"

  $ cat /run/systemd/generator/local-fs.target.requires/boot-efi.mount
  # Automatically generated by systemd-fstab-generator

  [Unit]
  SourcePath=/etc/fstab
  Documentation=man:fstab(5) man:systemd-fstab-generator(8)
  Before=local-fs.target

  [Mount]
  Where=/boot/efi
  What=/dev/disk/by-label/UEFI
  Type=vfat

  Related bugs:
   * bug 1680197: Zesty deployments failing sporadically
   * bug 1723183: transient systemd ordering issue when using overlayroot

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: systemd 234-2ubuntu12
  ProcVersionSignature: User Name 4.13.0-16.19-generic 4.13.4
  Uname: Linux 4.13.0-16-generic x86_64
  ApportVersion: 2.20.7-0ubuntu4
  Architecture: amd64
  Date: Mon Nov 13 21:06:36 2017
  Lsusb: Error: command ['lsusb'] failed with exit code 1:
  MachineType: QEMU Standard PC (i440FX + PIIX, 1996)
  ProcEnviron:
   TERM=vt220
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=C.UTF-8
   SHELL=/bin/bash
  ProcKernelCmdLine: nomodeset iscsi_initiator=maas-enlist iscsi_target_name=tgt-boot-test-7xuhwl iscsi_target_ip=10.0.12.2 iscsi_target_port=3260 iscsi_initiator=maas-enlist ip=::::maas-enlist:BOOTIF ro net.ifnames=0 BOOTIF_DEFAULT=eth0 root=/dev/disk/by-path/ip-10.0.12.2:3260-iscsi-tgt-boot-test-7xuhwl-lun-1-part1 overlayroot=tmpfs console=ttyS0 ds=nocloud-net;seedfrom=http://10.0.12.2:32600/
  SourcePackage: systemd
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 04/01/2014
  dmi.bios.vendor: SeaBIOS
  dmi.bios.version: 1.10.2-1ubuntu1
  dmi.chassis.type: 1
  dmi.chassis.vendor: QEMU
  dmi.chassis.version: pc-i440fx-artful
  dmi.modalias: dmi:bvnSeaBIOS:bvr1.10.2-1ubuntu1:bd04/01/2014:svnQEMU:pnStandardPC(i440FX+PIIX,1996):pvrpc-i440fx-artful:cvnQEMU:ct1:cvrpc-i440fx-artful:
  dmi.product.name: Standard PC (i440FX + PIIX, 1996)
  dmi.product.version: pc-i440fx-artful
  dmi.sys.vendor: QEMU

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/open-iscsi/+bug/1732028/+subscriptions


