[Bug 1986781] Re: [Ubuntu 22.04]cloud-init failed to complete after 10 minutes of waiting was shown during Installation via iDRAC Virtual Console
Mauricio Faria de Oliveira
1986781 at bugs.launchpad.net
Mon Jan 23 20:59:45 UTC 2023
Reproducer/Verification for SRU to Jammy,
based on strace delay injection on read().
Uploading to jammy.
Attaching debdiff for reference.
...
Launch a VM with the Jammy daily live server:
$ wget https://cdimage.ubuntu.com/ubuntu-server/jammy/daily-live/current/jammy-live-server-amd64.iso
$ ISO=jammy-live-server-amd64.iso
$ VM=casper-jammy
$ virt-install --name $VM --cdrom $ISO --vcpus 2 --memory 2048 --disk none --osinfo ubuntu-stable-latest
At the GRUB menu, press e to edit the boot entry, append
"break=init console=ttyS0" to the kernel command line, press ctrl-x to
boot, and close the window.
Open the serial console and chroot:
$ virsh console $VM
...
(initramfs) chroot /root /bin/bash
Test strace delay injection:
# time strace --trace read cat /dev/null
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\3206\2\0\0\0\0\0"..., 832) = 832
read(3, "", 131072) = 0
+++ exited with 0 +++
real 0m0.041s
user 0m0.009s
sys 0m0.030s
# time strace --trace read --inject read:delay_enter=5s cat /dev/null
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\3206\2\0\0\0\0\0"..., 832) = 832 (DELAYED)
read(3, "", 131072) = 0 (DELAYED)
+++ exited with 0 +++
real 0m10.041s
user 0m0.010s
sys 0m0.033s
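The 10 seconds of real time are consistent with two read() calls, each delayed by 5 seconds. A minimal sketch of that arithmetic, assuming the strace output was saved to a file (the helper name expected_delay and the scratch file are hypothetical, for illustration only):

```shell
#!/bin/sh
# Hypothetical helper: estimate the total injected delay from saved strace
# output by counting the read() lines and multiplying by the per-call delay.
expected_delay() {
    trace_file=$1      # file holding strace output
    delay_secs=$2      # value given to delay_enter=, in whole seconds
    reads=$(grep -c '^read(' "$trace_file")
    echo $((reads * delay_secs))
}

# Sample input copied from the delayed run above.
trace=$(mktemp)
cat > "$trace" <<'EOF'
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\3206\2\0\0\0\0\0"..., 832) = 832 (DELAYED)
read(3, "", 131072) = 0 (DELAYED)
+++ exited with 0 +++
EOF

expected_delay "$trace" 5
```

With delay_enter=60s, as used for casper-md5check below, every read() the service performs adds a full minute in the same way.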
Modify the casper-md5check service:
# sed -i '/^ExecStart=/ s,=,=/usr/bin/strace --inject read:delay_enter=60s ,' /usr/lib/systemd/system/casper-md5check.service
# cat /usr/lib/systemd/system/casper-md5check.service
[Unit]
Description=casper-md5check Verify Live ISO checksums
[Service]
Type=oneshot
ExecStart=/usr/bin/strace --inject read:delay_enter=60s /usr/lib/casper/casper-md5check /cdrom /cdrom/md5sum.txt
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target
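To try the sed expression without touching the real unit file, it can be run against a scratch copy first. This is only a sketch: the helper name inject_delay is hypothetical, and the unmodified ExecStart line is inferred from the modified one shown above.

```shell
#!/bin/sh
# Work on a scratch copy so the real unit file is untouched.
unit=$(mktemp)
cat > "$unit" <<'EOF'
[Unit]
Description=casper-md5check Verify Live ISO checksums
[Service]
Type=oneshot
ExecStart=/usr/lib/casper/casper-md5check /cdrom /cdrom/md5sum.txt
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target
EOF

# Hypothetical helper: prefix the ExecStart command with strace delay
# injection, using the same sed expression as above (GNU sed, for -i).
# $1 = unit file, $2 = delay such as 60s
inject_delay() {
    sed -i '/^ExecStart=/ s,=,=/usr/bin/strace --inject read:delay_enter='"$2"' ,' "$1"
}

inject_delay "$unit" 60s
grep '^ExecStart=' "$unit"
```

Only the first "=" on the ExecStart line is replaced, so the "=" inside delay_enter=60s is left alone.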
Exit and wait; the issue happens:
# exit
(initramfs) exit
...
Ubuntu 22.04.1 ubuntu-server ttyS0
connecting...
waiting for cloud-init... /
<10 minutes later>
================================================================================
Serial [ Help ]
================================================================================
As the installer is running on a serial console, it has started in basic
mode, using only the ASCII character set and black and white colours.
┌────────────────────────────────────────────────────────────────────────┐
│ │
│ cloud-init failed to complete after 10 minutes of waiting. This │
│ suggests a bug, which we would appreciate help understanding. If │
│ you could file a bug at │
│ https://bugs.launchpad.net/subiquity/+filebug and attach the │
│ contents of /var/log, it would be most appreciated. │
│ │
│ [ Switch to a shell ] │
│ [ Close ] │
│ │
└────────────────────────────────────────────────────────────────────────┘
[ Continue in rich mode > ]
[ Continue in basic mode > ]
...
Similarly, repeat, this time with test packages from ppa:mfo/lp1986781:
...
Open the serial console and chroot:
$ virsh console $VM
...
(initramfs) chroot /root /bin/bash
Install the test package:
# dhclient enp1s0
# wget https://launchpad.net/~mfo/+archive/ubuntu/lp1986781/+build/25512466/+files/casper_1.470.2_amd64.deb
# dpkg -i casper_*.deb
Modify the casper-md5check service:
# sed -i '/^ExecStart=/ s,=,=/usr/bin/strace --inject read:delay_enter=60s ,' /usr/lib/systemd/system/casper-md5check.service
# cat /usr/lib/systemd/system/casper-md5check.service
[Unit]
Description=casper-md5check Verify Live ISO checksums
After=multi-user.target
[Service]
Type=oneshot
ExecStart=/usr/bin/strace --inject read:delay_enter=60s /usr/lib/casper/casper-md5check /cdrom /cdrom/md5sum.txt
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target
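The visible difference in the test package's unit file is the After=multi-user.target line. A quick way to check for it could look like the sketch below; the helper name runs_after_multi_user is hypothetical, and the check only inspects the file text, not what systemd has actually loaded.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the unit file orders itself after
# multi-user.target, i.e. it no longer blocks that target.
runs_after_multi_user() {
    grep -Eq '^After=(.* )?multi-user\.target( |$)' "$1"
}

# Defaults to the path used above; pass another file as the first argument.
unit=${1:-/usr/lib/systemd/system/casper-md5check.service}
if [ -r "$unit" ] && runs_after_multi_user "$unit"; then
    echo "fixed: ordered after multi-user.target"
else
    echo "not fixed (or unit file not readable)"
fi
```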
Exit and wait; the issue does _not_ happen anymore:
# exit
(initramfs) exit
...
Ubuntu 22.04.1 ubuntu-server ttyS0
connecting...
waiting for cloud-init... /
<some seconds later>
================================================================================
Serial [ Help ]
================================================================================
As the installer is running on a serial console, it has started in basic
mode, using only the ASCII character set and black and white colours.
If you are connecting from a terminal emulator such as gnome-terminal that
supports unicode and rich colours you can switch to "rich mode" which uses
unicode, colours and supports many languages.
You can also connect to the installer over the network via SSH, which will
allow use of rich mode.
[ Continue in rich mode > ]
[ Continue in basic mode > ]
[ View SSH instructions ]
Checking that the casper-md5check service is still running:
Help > Enter shell.
# systemctl status casper-md5check.service --no-pager | grep 'Active:'
Active: activating (start) since Mon 2023-01-23 18:56:50 UTC; 3min 26s ago
And it should not be a problem, as its start timeout is not limited.
# systemctl show casper-md5check.service | grep -i timeout
TimeoutStartUSec=infinity
TimeoutStopUSec=1min 30s
TimeoutAbortUSec=1min 30s
TimeoutStartFailureMode=terminate
TimeoutStopFailureMode=terminate
TimeoutCleanUSec=infinity
JobTimeoutUSec=infinity
JobRunningTimeoutUSec=infinity
JobTimeoutAction=none
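For scripted verification, the start timeout can be pulled out of that output alone. A minimal sketch, assuming the `systemctl show` output is piped in on stdin (the helper name start_timeout is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: read `systemctl show` output on stdin and print only
# the value of TimeoutStartUSec.
start_timeout() {
    sed -n 's/^TimeoutStartUSec=//p'
}

# Sample input copied from the output above. On a live system this would be:
#   systemctl show casper-md5check.service | start_timeout
start_timeout <<'EOF'
TimeoutStartUSec=infinity
TimeoutStopUSec=1min 30s
EOF
```

A value of "infinity" means systemd will not terminate the service for taking too long to start.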
Quit and clean up the VM.
Press ctrl-]
$ virsh destroy $VM
$ virsh undefine $VM
** Description changed:
+ [Impact]
+
+ * Users that install Ubuntu Server through slow
+ media (eg, virtual optical drive over network,
+ which may be common on enterprise deployments)
+ might hit the following subiquity startup error:
+
+ 'cloud-init failed to complete after 10 minutes of waiting'
+
+ * (That in addition to 10 minutes of waiting themselves.)
+
+ * This happens because casper-md5check.service is
+ (slowly) verifying the integrity of install media,
+ which blocks `multi-user.target`,
+ which blocks `cloud-final.service`,
+ which blocks `cloud-init status --wait`
+ which is used in subiquity / `waiting on cloud-init`).
+
+ [Fix]
+
+ * The adopted solution (merged on lunar) is simply
+ not to block `multi-user.target`, but rather run
+ _after_ it.
+
+ [Test Steps]
+
+ For a synthetic reproducer of slowness of casper-md5check:
+
+ * boot with `break=init` to break into initramfs-tools
+ before exec() systemd.
+ * chroot /root /bin/bash
+ * edit /usr/lib/systemd/system/casper-md5check.service
+ * prepend `strace --inject read:delay_enter=5s` to the
+ command in `ExecStart`, to introduce a 5 secs delay
+ to every read() syscall performed by casper-md5check.
+ * exit twice (chroot, initramfs shell) to resume boot.
+
+ See comment 37 for examples.
+
+ [Other Info]
+
+ * There's a small glitch in the proposed solution:
+ the systemd line when casper-md5check finishes
+ shows up on top of subiquity's menu (screenshot):
+
+ "[ OK ] Finished casper-md5check Verify Live ISO checksums."
+
+ Dan Bungert mentioned this is known and should be
+ addressed in a future change to subiquity, and is
+ not supposed to block the SRU for Jammy / 22.04.2.
+
+
+ [Original Description]
+
Description:
On Dell EMC PowerEdge system when Install Ubuntu 22.04 via iDRAC Virtual
Console, cloud-init failed to complete after 10 minutes of waiting.
Steps to Reproduce:
1. Login to iDRAC and Launch Virtual Console.
2. Connect to Virtual Media and Map ubuntu 22.04 iso file using Map CD/DVD option.
3. Try Installing Ubuntu server.
4. "cloud-init" failed to complete after 10 minutes of waiting was shown during Installation.
Expected Results :-
Installation should be successful.
** Description changed:
[Impact]
- * Users that install Ubuntu Server through slow
- media (eg, virtual optical drive over network,
- which may be common on enterprise deployments)
- might hit the following subiquity startup error:
+ * Users that install Ubuntu Server through slow
+ media (eg, virtual optical drive over network,
+ which may be common on enterprise deployments)
+ might hit the following subiquity startup error:
- 'cloud-init failed to complete after 10 minutes of waiting'
+ 'cloud-init failed to complete after 10 minutes of waiting'
- * (That in addition to 10 minutes of waiting themselves.)
+ * (That in addition to 10 minutes of waiting themselves.)
- * This happens because casper-md5check.service is
- (slowly) verifying the integrity of install media,
- which blocks `multi-user.target`,
- which blocks `cloud-final.service`,
- which blocks `cloud-init status --wait`
- which is used in subiquity / `waiting on cloud-init`).
+ * This happens because casper-md5check.service is
+ (slowly) verifying the integrity of install media,
+ which blocks `multi-user.target`,
+ which blocks `cloud-final.service`,
+ which blocks `cloud-init status --wait`
+ which is used in subiquity / `waiting on cloud-init`).
[Fix]
- * The adopted solution (merged on lunar) is simply
- not to block `multi-user.target`, but rather run
- _after_ it.
-
+ * The adopted solution (merged on lunar) is simply
+ not to block `multi-user.target`, but rather run
+ _after_ it.
+
[Test Steps]
- For a synthetic reproducer of slowness of casper-md5check:
-
- * boot with `break=init` to break into initramfs-tools
- before exec() systemd.
- * chroot /root /bin/bash
- * edit /usr/lib/systemd/system/casper-md5check.service
- * prepend `strace --inject read:delay_enter=5s` to the
- command in `ExecStart`, to introduce a 5 secs delay
- to every read() syscall performed by casper-md5check.
- * exit twice (chroot, initramfs shell) to resume boot.
-
- See comment 37 for examples.
+ For a synthetic reproducer of slowness of casper-md5check:
- [Other Info]
+ * boot with `break=init` to break into initramfs-tools
+ before exec() systemd.
+ * chroot /root /bin/bash
+ * edit /usr/lib/systemd/system/casper-md5check.service
+ * prepend `strace --inject read:delay_enter=5s` to the
+ command in `ExecStart`, to introduce a 5 secs delay
+ to every read() syscall performed by casper-md5check.
+ * exit twice (chroot, initramfs shell) to resume boot.
- * There's a small glitch in the proposed solution:
- the systemd line when casper-md5check finishes
- shows up on top of subiquity's menu (screenshot):
+ See comment 37 for examples.
- "[ OK ] Finished casper-md5check Verify Live ISO checksums."
+ [Regression Potential]
- Dan Bungert mentioned this is known and should be
- addressed in a future change to subiquity, and is
- not supposed to block the SRU for Jammy / 22.04.2.
+ * Functionality related to install media integrity check.
+ * Users with corrupted install media might not realize
+ this until later on; but this is rarely the case and
+ even w/out the fix, there's a lot that runs _before_
+ we even get to casper-md5check, so they may (still)
+ see errors early anyway.
+
+ * There's a cosmetic glitch in the proposed solution:
+ the systemd line when casper-md5check finishes
+ shows up on top of subiquity's menu (screenshot):
+
+ "[ OK ] Finished casper-md5check Verify Live ISO checksums."
+
+ Dan Bungert mentioned this is known and should be
+ addressed in a future change to subiquity, and is
+ not supposed to block the SRU for Jammy / 22.04.2.
[Original Description]
Description:
On Dell EMC PowerEdge system when Install Ubuntu 22.04 via iDRAC Virtual
Console, cloud-init failed to complete after 10 minutes of waiting.
Steps to Reproduce:
1. Login to iDRAC and Launch Virtual Console.
2. Connect to Virtual Media and Map ubuntu 22.04 iso file using Map CD/DVD option.
3. Try Installing Ubuntu server.
4. "cloud-init" failed to complete after 10 minutes of waiting was shown during Installation.
Expected Results :-
Installation should be successful.
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to casper in Ubuntu.
https://bugs.launchpad.net/bugs/1986781
Title:
[Ubuntu 22.04]cloud-init failed to complete after 10 minutes of
waiting was shown during Installation via iDRAC Virtual Console
Status in cloud-init:
Invalid
Status in subiquity:
Invalid
Status in casper package in Ubuntu:
Fix Released
Status in casper source package in Focal:
Triaged
Status in casper source package in Jammy:
In Progress
Status in casper source package in Kinetic:
Won't Fix
Bug description:
[Impact]
* Users that install Ubuntu Server through slow
media (eg, virtual optical drive over network,
which may be common on enterprise deployments)
might hit the following subiquity startup error:
'cloud-init failed to complete after 10 minutes of waiting'
* (That in addition to 10 minutes of waiting themselves.)
* This happens because casper-md5check.service is
(slowly) verifying the integrity of install media,
which blocks `multi-user.target`,
which blocks `cloud-final.service`,
which blocks `cloud-init status --wait`
(which is used in subiquity / `waiting on cloud-init`).
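The effect of that chain can be illustrated with a bounded wait. This is not subiquity's code, only a sketch of the idea: the helper name wait_or_give_up is hypothetical, and `sleep 3` stands in for a `cloud-init status --wait` that is held up by slow media.

```shell
#!/bin/sh
# Hypothetical helper: run a command under a time limit and report whether
# it finished. Any non-zero exit, including a timeout, counts as a failure.
# $1 = limit in seconds, remaining arguments = command to run
wait_or_give_up() {
    limit=$1
    shift
    if timeout "$limit" "$@"; then
        echo "completed"
    else
        echo "failed to complete after $limit seconds of waiting"
    fi
}

# Stand-in for the installer case, which would look more like:
#   wait_or_give_up 600 cloud-init status --wait
wait_or_give_up 1 sleep 3
```

Anything ordered before `cloud-final.service` that takes longer than the limit produces the failure branch, no matter how healthy cloud-init itself is.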
[Fix]
* The adopted solution (merged on lunar) is simply
not to block `multi-user.target`, but rather run
_after_ it.
[Test Steps]
For a synthetic reproducer of slowness of casper-md5check:
* boot with `break=init` to break into initramfs-tools
before exec() systemd.
* chroot /root /bin/bash
* edit /usr/lib/systemd/system/casper-md5check.service
* prepend `strace --inject read:delay_enter=5s` to the
command in `ExecStart`, to introduce a 5 secs delay
to every read() syscall performed by casper-md5check.
* exit twice (chroot, initramfs shell) to resume boot.
See comment 37 for examples.
[Regression Potential]
* Functionality related to install media integrity check.
* Users with corrupted install media might not realize
this until later on; but this is rarely the case and
even w/out the fix, there's a lot that runs _before_
we even get to casper-md5check, so they may (still)
see errors early anyway.
* There's a cosmetic glitch in the proposed solution:
the systemd line when casper-md5check finishes
shows up on top of subiquity's menu (screenshot):
"[ OK ] Finished casper-md5check Verify Live ISO checksums."
Dan Bungert mentioned this is known and should be
addressed in a future change to subiquity, and is
not supposed to block the SRU for Jammy / 22.04.2.
[Original Description]
Description:
On Dell EMC PowerEdge system when Install Ubuntu 22.04 via iDRAC
Virtual Console, cloud-init failed to complete after 10 minutes of
waiting.
Steps to Reproduce:
1. Login to iDRAC and Launch Virtual Console.
2. Connect to Virtual Media and Map ubuntu 22.04 iso file using Map CD/DVD option.
3. Try Installing Ubuntu server.
4. "cloud-init" failed to complete after 10 minutes of waiting was shown during Installation.
Expected Results :-
Installation should be successful.
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1986781/+subscriptions