[Bug 1829823] Re: libvirt-bin: during shutdown libvirt-bin is stopped before libvirt-guests causing hang

Christian Ehrhardt  1829823 at bugs.launchpad.net
Thu Jul 18 10:33:23 UTC 2019


On Thu, Jul 18, 2019 at 12:30 PM Robie Basak <1829823 at bugs.launchpad.net> wrote:
>
> Am I correct in my understanding that it would be fine for the cloud
> archive tooling if we landed the upstart change in proposed but it does
> not go to updates, pending a future SRU or security update?

Yes, coreycb already confirmed that somewhere above (I had the same
question).

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1829823

Title:
  libvirt-bin: during shutdown libvirt-bin is stopped before libvirt-
  guests causing hang

Status in Ubuntu Cloud Archive:
  Invalid
Status in Ubuntu Cloud Archive mitaka series:
  In Progress
Status in libvirt package in Ubuntu:
  Invalid
Status in libvirt source package in Xenial:
  Triaged

Bug description:
  [Impact]

   * libvirt-bin is affected in libvirt-1.3.1-1ubuntu10.24~cloud0 in the
     trusty mitaka UCA and in the parent package in xenial,
     libvirt-1.3.1-1ubuntu10.24.

   * When you shut down a trusty system that is running some kvm virtual
     machines, the libvirt-bin service is stopped before libvirt-guests.
     libvirt-guests tries to connect to the libvirt socket to send shutdown
     commands to the running vms, which cannot happen since libvirtd is no
     longer running.

   * On some machines, the qemu processes behind the virtual machines are
     not killed and are left behind as defunct processes, which can cause
     the system to hang while waiting for them to terminate.

   * The bug is caused by the libvirt-bin upstart script [1] calling a
     non-existent script, /usr/lib/libvirt/libvirt-stop-guests [2]. This
     logic used to live in the upstart script itself in version
     1.2.2-0ubuntu13.1.27 [3], the version in the trusty archives. In
     liberty UCA, version 1.2.16-2ubuntu11.15.10.4~cloud0 [4], it
     was separated out into /usr/lib/libvirt/libvirt-stop-guests [2].
     In the mitaka release, the libvirt-stop-guests script was removed and
     rewritten as /etc/init.d/libvirt-guests [5], but the script in [1] was
     never updated to point to it.

     [1] http://paste.ubuntu.com/p/GxxBczkCmk
     [2] http://paste.ubuntu.com/p/fKCDQh46vh
     [3] http://paste.ubuntu.com/p/QrKXqK2Bvz
     [4] http://paste.ubuntu.com/p/W8DgQwpYv3
     [5] http://paste.ubuntu.com/p/Z28Sp2fPd6

   * Since the upstart script was never updated to point to it, libvirt-bin
     stops without stopping libvirt-guests first. When libvirt-guests is
     stopped later, it cannot access the libvirt socket and therefore
     cannot shut down the machines, causing the bug.

   * The fix is to change the upstart script to point to the new libvirt-
     guests script.
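
   A minimal sketch of how the stale reference could be detected before a
   shutdown. The function name find_missing_helpers is hypothetical, and
   the job file location used in the usage note (/etc/init/libvirt-bin.conf)
   is an assumption that is not stated in this report. The function lists
   every path under a given prefix that the job file mentions but that
   does not exist on disk.

```shell
#!/bin/sh
# Hypothetical helper: print each path under PREFIX that is referenced in
# JOB_FILE but does not exist on this system.
# Usage: find_missing_helpers JOB_FILE [PREFIX]
find_missing_helpers() {
    job_file=$1
    prefix=${2:-/usr/lib/libvirt/}
    # Extract every reference that starts with the prefix, de-duplicate,
    # then keep only those that are missing from the filesystem.
    grep -o "${prefix}[A-Za-z0-9._-]*" "$job_file" | sort -u |
    while read -r helper; do
        [ -e "$helper" ] || echo "$helper"
    done
}
```

   On an affected system, running
   "find_missing_helpers /etc/init/libvirt-bin.conf" would be expected to
   print /usr/lib/libvirt/libvirt-stop-guests, assuming the job file is
   at that location.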

  [Test Case]

   * You can reproduce this in trusty with the mitaka UCA enabled.

   1) Enable mitaka UCA and install libvirt0 and libvirt-bin

   $ sudo add-apt-repository cloud-archive:mitaka
   $ sudo apt update
   $ sudo apt install libvirt0 libvirt-bin

   2) Install a virtual machine, either by using virt-install or
      virt-manager.
      I used a bionic VM.

   3) Enable debugging on libvirt-guests so you can see what is going on

   Modify /etc/init.d/libvirt-guests and add "-x" to the end of the
   "#!/bin/sh" line

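   Step 3 can also be scripted. This is only a sketch: the function name
   enable_sh_trace is hypothetical, and it assumes the first line of the
   script is exactly "#!/bin/sh". It rewrites the file through a temporary
   copy so that the original file mode is kept.

```shell
#!/bin/sh
# Hypothetical helper: append " -x" to a "#!/bin/sh" first line, in place.
# The file is left unchanged if line 1 is anything else, including an
# already modified "#!/bin/sh -x".
enable_sh_trace() {
    script=$1
    tmp=$(mktemp) || return 1
    sed '1s|^#!/bin/sh[[:space:]]*$|#!/bin/sh -x|' "$script" > "$tmp" &&
        cat "$tmp" > "$script"   # overwrite contents, keep permissions
    status=$?
    rm -f "$tmp"
    return $status
}
```

   From a root shell this would be called as
   "enable_sh_trace /etc/init.d/libvirt-guests".
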
   4) With the vm running, shut down the system

   $ sudo shutdown -h now

   5) After the next boot, check /var/log/upstart/libvirt-bin.log. It will
   say "No such file or directory: /usr/lib/libvirt/libvirt-stop-guests"

   6) During that shutdown, you will see messages like:
   error: failed to connect to the hypervisor
   error: no valid connection
   error: Failed to connect socket to '/var/run/libvirt/libvirt-sock':
   No such file or directory

   What should happen:

   If you follow the same steps with the fixed package, when you look at
   /var/log/upstart/libvirt-bin.log, you will see output of libvirt-guests
   connecting to and shutting down the virtual machines which looks a little
   like this: https://paste.ubuntu.com/p/s4jyJX2y9F/
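
   The check in step 5 can be sketched as a small shell function. The name
   log_shows_bug is hypothetical, and the exact wording of the log line is
   an assumption, so the sketch only looks for the missing helper path.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if the given log mentions the missing
# libvirt-stop-guests helper from step 5, non-zero otherwise.
log_shows_bug() {
    log_file=$1
    grep -q -F '/usr/lib/libvirt/libvirt-stop-guests' "$log_file"
}
```

   For example:
   "log_shows_bug /var/log/upstart/libvirt-bin.log && echo affected".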

  [Regression Potential]

   * There is only one file modified, the upstart script for libvirt-bin.
     Currently this upstart file references a file which doesn't exist, so
     fixing it will restore the behavior in a way which aligns with exactly
     what took place in previous versions.

   * In xenial, none of this is used at all; see "Other Info" below.

   * This change only affects systems during shutdown while they still
     have virtual machines running, and does not affect starting and
     stopping services while the machine is running normally.

   * I believe the regression potential is low.

  [Other Info]

   * Xenial is not affected by this bug even though it ships the exact same
     packages. This is because xenial uses insserv, which parses the
     scripts in /etc/init.d/ to generate the service dependency files
     ".depend.boot", ".depend.start" and ".depend.stop", and systemd
     respects the dependency ordering in these files.
     libvirt-guests reports a dependency on libvirt-bin in the script
     header, so systemd will always stop libvirt-guests before libvirt-bin,
     avoiding the problem seen in trusty.

   * The fix is needed in trusty mitaka UCA and xenial will likely need the
     SRU as part of the process.

   * We'd never have uploaded that change alone for xenial (being a no-op
     that causes MBs of downloads and an upgrade). But we will bundle it
     with an actual change so it can "ride along" to eventually help
     Trusty-Mitaka. Unfortunately there is no "current" Xenial SRU for
     libvirt, hence we want to get it into xenial-proposed (which is enough
     for the UCA tooling) but do not want to release it to xenial-updates
     until another SRU comes by, which we will then generate with a -v
     covering both.

   * That way, if e.g. a security fix comes by, it will also be based on
     what is in proposed.
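
   The init script header dependency mentioned for xenial above can be
   inspected with a short sketch like the following. The function name
   lsb_field is hypothetical; it only assumes the standard LSB comment
   block delimited by "### BEGIN INIT INFO" and "### END INIT INFO", and
   makes no claim about which fields libvirt-guests actually sets.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one LSB header field
# (for example Required-Stop) from an init script.
# Usage: lsb_field SCRIPT FIELD
lsb_field() {
    script=$1
    field=$2
    # Restrict to the LSB header block, then strip the "# Field:" prefix.
    sed -n '/^### BEGIN INIT INFO/,/^### END INIT INFO/p' "$script" |
    sed -n "s/^#[[:space:]]*${field}:[[:space:]]*//p"
}
```

   For example, "lsb_field /etc/init.d/libvirt-guests Required-Stop"
   would print the services that the script declares in that field, if
   any.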

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1829823/+subscriptions


