[Bug 1031065] Re: cloud-init-nonet runs 'start networking' explicitly

Scott Moser smoser at ubuntu.com
Thu Jan 31 19:26:41 UTC 2013


** Description changed:

+ == Begin SRU Information ==
+ [Impact]
+ Cloud-init in 12.04 has an upstart job named 'cloud-init-nonet' that
+ calls 'start networking' explicitly. This was done to fix a boot
+ deadlock (bug 800824), but it was not the proper fix.
+ 
+ A more correct fix is now possible because of improvements
+ that have been made in mountall (bug 643289) and have been
+ backported to 12.04.
+ 
+ Calling 'start networking' from cloud-init-nonet could cause issues
+ because other upstart 'start on' conditions might not be met at this
+ point in boot.
+ 
+ The fix here is the same as is now applied in quantal and raring.
+ It more correctly addresses the root issue, that network-device-added
+ events were not being emitted in a container.  There is now a job named
+ cloud-init-container that will emit 'network-device-added' events if
+ it is inside a container *and* its sanity checks show that the given
+ device has not already been brought up.
+ 
+ [Test Case]
+ The following steps demonstrate why this change was necessary and
+ that the provided change fixes the issue:
+ 
+ ## create a 'source' (pristine) root ##
+ sudo lxc-create -t ubuntu-cloud --name source-precise-amd64 -- \
+    --release precise --arch amd64 --stream daily
+ 
+ # set up 2 copies of the pristine root
+ # * 'nostart' just has cloud-init's call to 'start networking' disabled
+ # * 'patched' contains the full upgraded cloud-init
+ sudo lxc-clone -o source-precise-amd64 -n nostart
+ f="/var/lib/lxc/nostart/rootfs/etc/init/cloud-init-nonet.conf"
+ if [ ! -e "$f.dist" ]; then
+    # disable 'start networking' in the cloud-init-nonet job
+    sudo cp "$f" "$f.dist"
+    sudo sed -i 's,^\([ ]\+start networking.*\),#\1,' "$f"
+ fi
+ 
+ sudo lxc-clone -o source-precise-amd64 -n patched
+ deb="cloud-init_0.6.3-0ubuntu1.5~ppa1_all.deb"
+ rpath=/var/lib/lxc/patched/rootfs
+ sudo cp "$deb" "$rpath/tmp"
+ sudo LANG=C chroot "$rpath" dpkg -i "/tmp/$deb"
+ 
+ ## Now start both. The 'nostart' root will hang in
+ ## cloud-init waiting for networking to come up.
+ ## The 'patched' root will come all the way up quickly.
+ ## You can stop them with 'sudo lxc-stop -n <name>'
+ sudo lxc-start -n patched -- /sbin/init --verbose
+ sudo lxc-start -n nostart -- /sbin/init --verbose
+ 
+ [Regression Potential]
+ Regressions would be likely to occur in one of 2 places:
+ a.) inside a container, where cloud-init-container caused a
+     problem by emitting its network-device-added events
+ 
+     Here, the problem would be limited to lxc containers
+     that have cloud-init inside them.  That population is likely
+     very small.
+ 
+ b.) outside a container, where 'start networking' was previously
+     "fixing" a boot deadlock that could have occurred.
+     Here, the changes are basically making precise work like 12.10
+     and 13.04, so hopefully issues would have been shaken out there.
+ == End SRU Information ==
+ 
+ 
  While developing the 'overlayroot' package, I was mounting / as rw in the initramfs.
  This caused mounts to execute in a different order and, as a result, changed the order of resolvconf and networking bringup.
  
  It exposed a bug in resolvconf, which was being called on boot in this order:
  ==== Mon Jul 30 19:08:58 UTC 2012 /sbin/resolvconf -a lo.inet ====
  ==== Mon Jul 30 19:08:59 UTC 2012 /sbin/dhclient-script  ====
  ==== Mon Jul 30 19:08:59 UTC 2012 /sbin/resolvconf -a eth0.dhclient ====
  ==== Mon Jul 30 19:08:59 UTC 2012 /sbin/resolvconf -a eth0.inet ====
  ==== Mon Jul 30 19:08:59 UTC 2012 /sbin/resolvconf --enable-updates ====
  ==== Mon Jul 30 19:08:59 UTC 2012 /etc/resolvconf/update.d/libc -u ====
  
  The normal order is for resolvconf --enable-updates to be called (from /etc/init/resolvconf.conf) before anything else. As a result, I was seeing errors like:
    resolvconf: Error: /run/resolvconf/interface either does not exist or is not a directory
  
  This may be exposing a more grave issue, in that I believe the reason
  for dhclient coming up before resolvconf.conf started was that it was
  being run as a result of /etc/init/network-interface.conf. I don't
  immediately see how that is guaranteed to have /run mounted at all.
  
  ProblemType: Bug
  DistroRelease: Ubuntu 12.10
  Package: resolvconf 1.67ubuntu1
  ProcVersionSignature: Ubuntu 3.5.0-6.6-generic 3.5.0
  Uname: Linux 3.5.0-6-generic x86_64
  Architecture: amd64
  Date: Mon Jul 30 20:07:04 2012
  PackageArchitecture: all
  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  SourcePackage: resolvconf
  UpgradeStatus: No upgrade log present (probably fresh install)
  
  Related Bugs:
   * bug 800824: cloud-init-nonet times out in lxc
   * bug 925122: container's udevadm trigger --add affects the host
-  * bug 643289: idmapd does not starts to work after system reboot
+  * bug 643289: idmapd does not starts to work after system reboot
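
For illustration only, the second condition described in [Impact] (emit an
event only for devices that have not already been brought up) could be
sketched roughly as below. This is not the shipped cloud-init-container job.
The function names are hypothetical, the marker-file prefix is an assumption
passed in by the caller, the in-container check is left out, and the sketch
only prints what it would emit rather than calling initctl. The event name is
the one used in the description above.

```shell
#!/bin/sh
# Hypothetical helper, not the shipped job: print the interfaces that
# still need a synthetic event.
#   $1     marker-file prefix (an assumption; one file per brought-up device)
#   $2...  candidate interface names
pending_ifaces() {
    marker_prefix="$1"; shift
    for iface in "$@"; do
        # loopback never needs a synthetic event
        [ "$iface" = "lo" ] && continue
        # a marker file is taken to mean the device is already up
        [ -f "${marker_prefix}${iface}" ] && continue
        echo "$iface"
    done
}

# Dry run: report the events instead of emitting them.
dry_run_emit() {
    for iface in $(pending_ifaces "$@"); do
        echo "would emit: network-device-added INTERFACE=$iface"
    done
}
```

Called as `dry_run_emit /run/network/ifup. lo eth0 eth1` (the prefix here is
an assumed location), it would report only those interfaces that have no
marker file yet.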

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to cloud-init in Ubuntu.
https://bugs.launchpad.net/bugs/1031065

Title:
  cloud-init-nonet runs 'start networking' explicitly

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1031065/+subscriptions


