[Bug 1986781] Re: [Ubuntu 22.04]cloud-init failed to complete after 10 minutes of waiting was shown during Installation via iDRAC Virtual Console

Mauricio Faria de Oliveira 1986781 at bugs.launchpad.net
Mon Jan 23 21:07:35 UTC 2023


For documentation purposes,

The critical chain of systemd units/time
before/after the fix, with a fake 'sleep 30'
as the command in casper-md5check.service.

Before:

root at ubuntu-server:/# systemd-analyze critical-chain cloud-final.service
The time when unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.

cloud-final.service +492ms
`-multi-user.target @37.968s
  `-casper-md5check.service @7.954s +30.010s
    `-basic.target @7.947s
      `-sockets.target @7.946s
        `-snap.lxd.user-daemon.unix.socket @15.740s
          `-sysinit.target @7.892s
            `-cloud-init.service @5.854s +2.032s
              `-systemd-networkd-wait-online.service @5.835s +6ms
                `-systemd-networkd.service @5.713s +119ms
                  `-network-pre.target @5.709s
                    `-cloud-init-local.service @1.943s +3.765s
                      `-systemd-remount-fs.service @1.172s +130ms
                        `-systemd-journald.socket @988ms
                          `--.mount @942ms
                            `--.slice @941ms
                            
After:

root at ubuntu-server:/# systemd-analyze critical-chain cloud-final.service
The time when unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.

cloud-final.service +901ms
`-multi-user.target @21.831s
  `-ssh.service @21.712s +38ms
    `-basic.target @8.255s
      `-sockets.target @8.254s
        `-snap.lxd.user-daemon.unix.socket @17.452s
          `-sysinit.target @8.202s
            `-cloud-init.service @6.970s +1.225s
              `-systemd-networkd-wait-online.service @6.958s +9ms
                `-systemd-networkd.service @6.842s +113ms
                  `-network-pre.target @6.838s
                    `-cloud-init-local.service @1.938s +4.898s
                      `-systemd-remount-fs.service @1.160s +174ms
                        `-systemd-journald.socket @941ms
                          `-system.slice @894ms
                            `--.slice @894ms

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to casper in Ubuntu.
https://bugs.launchpad.net/bugs/1986781

Title:
  [Ubuntu 22.04]cloud-init failed to complete after 10 minutes of
  waiting was shown during Installation via iDRAC Virtual Console

Status in cloud-init:
  Invalid
Status in subiquity:
  Invalid
Status in casper package in Ubuntu:
  Fix Released
Status in casper source package in Focal:
  Triaged
Status in casper source package in Jammy:
  In Progress
Status in casper source package in Kinetic:
  Won't Fix

Bug description:
  [Impact]

   * Users that install Ubuntu Server through slow
     media (eg, virtual optical drive over network,
     which may be common on enterprise deployments)
     might hit the following subiquity startup error:

    'cloud-init failed to complete after 10 minutes of waiting'

   * (That in addition to 10 minutes of waiting themselves.)

   * This happens because casper-md5check.service is
     (slowly) verifying the integrity of install media,
     which blocks `multi-user.target`,
     which blocks `cloud-final.service`,
     which blocks `cloud-init status --wait`
     which is used in subiquity / `waiting on cloud-init`).

  [Fix]

   * The adopted solution (merged on lunar) is simply
     not to block `multi-user.target`, but rather run
     _after_ it.

  [Test Steps]

   For a synthetic reproducer of slowness of casper-md5check:

   * boot with `break=init` to break into initramfs-tools
     before exec() systemd.
   * chroot /root /bin/bash
   * edit /usr/lib/systemd/system/casper-md5check.service
   * prepend `strace --inject read:delay_enter=5s` to the
     command in `ExecStart`, to introduce a 5 secs delay
     to every read() syscall performed by casper-md5check.
   * exit twice (chroot, initramfs shell) to resume boot.

    See comment 37 for examples.

  [Regression Potential]

   * Functionality related to install media integrity check.

   * Users with corrupted install media might not realize
     this until later on; but this is rarely the case and
     even w/out the fix, there's a lot that runs _before_
     we even get to casper-md5check, so they may (still)
     see errors early anyway.

   * There's a cosmetic glitch in the proposed solution:
     the systemd line when casper-md5check finishes
     shows up on top of subiquity's menu (screenshot):

     "[ OK ] Finished casper-md5check Verify Live ISO checksums."

     Dan Bungert mentioned this is known and should be
     addressed in a future change to subiquity, and is
     not supposed to block the SRU for Jammy / 22.04.2.

  [Original Description]

  Description:

  On Dell EMC PowerEdge system when Install Ubuntu 22.04 via iDRAC
  Virtual Console, cloud-init failed to complete after 10 minutes of
  waiting.

  Steps to Reproduce:

  1. Login to iDRAC and Launch Virtual Console.
  2. Connect to Virtual Media and Map ubuntu 22.04 iso file using Map CD/DVD option.
  3. Try Installing Ubuntu server.
  4. "cloud-init" failed to complete after 10 minutes of waiting was shown during Installation.

  Expected Results :-

  Installation should be successful.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1986781/+subscriptions




More information about the foundations-bugs mailing list