[Bug 1731051] Re: (arm64) VM fails to properly reboot

Sean Feole sean.feole at canonical.com
Fri Nov 17 18:04:56 UTC 2017


ubuntu@lundmark:~$ dpkg -l | grep qemu
ii  ipxe-qemu                                  1.0.0+git-20161027.b991c67+really20150424.a25a16d-1ubuntu2 all          PXE boot firmware - ROM images for qemu
ii  qemu-block-extra:arm64                     1:2.10+dfsg-0ubuntu3.1                                     arm64        extra block backend modules for qemu-system and qemu-utils
ii  qemu-efi                                   0~20170911.5dfba97c-1ubuntu0.1                             all          transitional dummy package
ii  qemu-efi-aarch64                           0~20170911.5dfba97c-1ubuntu0.1                             all          UEFI firmware for 64-bit ARM virtual machines
ii  qemu-guest-agent                           1:2.10+dfsg-0ubuntu3.1                                     arm64        Guest-side qemu-system agent
ii  qemu-kvm                                   1:2.10+dfsg-0ubuntu3.1                                     arm64        QEMU Full virtualization
ii  qemu-system-aarch64                        1:2.10+dfsg-0ubuntu3.1                                     arm64        QEMU full system emulation binaries (aarch64)
ii  qemu-system-arm                            1:2.10+dfsg-0ubuntu3.1                                     arm64        QEMU full system emulation binaries (arm)
ii  qemu-system-common                         1:2.10+dfsg-0ubuntu3.1                                     arm64        QEMU full system emulation binaries (common files)
ii  qemu-utils                                 1:2.10+dfsg-0ubuntu3.1                                     arm64        QEMU utilities
ubuntu@lundmark:~$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 17.10
Release:	17.10
Codename:	artful
ubuntu@lundmark:~$ sudo virsh list --all
 Id    Name                           State
----------------------------------------------------
 7     ubuntu1710111                  running
 -     ubuntu1710                     shut off

ubuntu@lundmark:~$ sudo virsh destroy ubuntu1710111
Domain ubuntu1710111 destroyed

ubuntu@lundmark:~$ sudo virsh start ubuntu1710111
Domain ubuntu1710111 started

ubuntu@lundmark:~$ sudo virsh reboot ubuntu1710111 --mode acpi
Domain ubuntu1710111 is being rebooted


VM Console 
<SNIP>

[  OK  ] Started Set console scheme.
[  OK  ] Created slice system-getty.slice.
[  OK  ] Started Getty on tty1.
[  OK  ] Reached target Login Prompts.
[  OK  ] Started LSB: QEMU Guest Agent startup script.
[  OK  ] Started LSB: automatic crash report generation.

Ubuntu 17.10 ubuntu ttyAMA0

ubuntu login: [  OK  ] Closed Load/Save RF Kill Switch Status /dev/rfkill Watch.
[  OK  ] Stopped target Graphical Interface.
[  OK  ] Stopped target Multi-User System.
[  OK  ] Stopped target Login Prompts.
         Stopping Serial Getty on ttyAMA0...
         Stopping Snappy daemon...
         Stopping Getty on tty1...
         Stopping System Logging Service...
         Stopping Authorization Manager...

</SNIP>

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1731051

Title:
  (arm64) VM fails to properly reboot

Status in Ubuntu Cloud Archive:
  New
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Artful:
  Fix Committed

Bug description:
  [Impact]

   * Newer qemu crashes on older kernels (on arm) because it uses a
     feature that these older kernels do not support.

   * This is a backport of a fix. The detection code already exists in
     qemu; the fix only makes sure that the related function is not
     queued when the feature is unavailable, which prevents the crash.

  [Test Case]

   * (on arm64 for the actual case - is a no-change everywhere else)
     1. create a virtual machine that runs fine
     2. suspend it
        $ sudo virsh dompmsuspend ubuntu1710 --target mem
     3. wake it up
        $ sudo virsh dompmwakeup ubuntu1710
     => Before the fix this sequence crashed qemu as outlined in the initial 
        report below
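
  The suspend/wake sequence above could be scripted roughly as follows.
  This is only a sketch: the helper names (run_virsh,
  survives_suspend_wake) are made up for illustration, it assumes
  passwordless sudo and a guest with a working guest agent, and it
  checks the domain state right after the wakeup without waiting.

```python
import subprocess
from typing import Callable, List

# A runner takes virsh arguments and returns the command's stdout.
Runner = Callable[[List[str]], str]


def run_virsh(args: List[str]) -> str:
    """Run `sudo virsh <args>`; raises CalledProcessError on a non-zero exit."""
    result = subprocess.run(
        ["sudo", "virsh", *args],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def survives_suspend_wake(domain: str, runner: Runner = run_virsh) -> bool:
    """Suspend the guest to memory, wake it, and report whether it is running."""
    try:
        runner(["dompmsuspend", domain, "--target", "mem"])
        runner(["dompmwakeup", domain])
        state = runner(["domstate", domain])
    except subprocess.CalledProcessError:
        # virsh reported a failure, e.g. because the domain is gone.
        return False
    return state == "running"
```

  Calling survives_suspend_wake("ubuntu1710") on an affected host would
  be expected to return False before the fix and True after it. The
  runner parameter exists only so the logic can be exercised without a
  hypervisor.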

  [Regression Potential]

   * This only affects arm (which limits regressions elsewhere), and it
     is a backport rather than a change written from scratch (which
     limits risk further). All it does is stop adding the ITS action,
     a feature that was only added in Artful's qemu. Even if there
     were a case where the detection is imperfect, the user would just
     fall back to how it worked in zesty. That is a lot of IFs
     (= unlikely), and even then the impact would hopefully be minimal.
     So I think the regression risk is very low for this change.

  [Other Info]
   
   * This is even more important for backports of this qemu, such as
     the Ubuntu Cloud Archive.

  ---

  The Pike cloud archive has a regression, compared to Ocata, wherein
  rebooting a VM via virsh causes the VM to power down and then exit.
  The VM does not automatically power back up, but can be restarted.

  Repro:

  Install 16.04.3 on an ARM64 host
  Fully update the install
  add-apt-repository cloud-archive:pike
  apt-get update
  apt-get install qemu-efi virt-manager libvirt-bin qemu-guest-agent qemu-system-aarch64
  wget http://cdimage.ubuntu.com/ubuntu/releases/17.10/release/ubuntu-17.10-server-arm64.iso
  create a new session via ssh (session B)
  In session B: virt-install --accelerate --cdrom ubuntu-17.10-server-arm64.iso --disk size=10 --name ubuntu1710 --os-type linux --ram 1024
  Once the install completes and the guest is at the login prompt, in session A: virsh reboot ubuntu1710 --mode acpi

  Observed result:
  The guest will power down as expected (from logs on session B), and then session B will be dumped back to the host shell. "virsh list" will not show the ubuntu1710 domain.

  Expected result:
  The guest powers back on, and boots back to the login prompt.

  Analysis:
  We observe these errors in various logs:

  Nov 1 13:29:16 ubuntu libvirtd[2441]: 2017-11-01 20:29:16.882+0000: 2441: error : qemuMonitorIORead:595 : Unable to read from monitor: Connection reset by peer
  Nov 1 13:29:16 ubuntu libvirtd[2441]: 2017-11-01 20:29:16.882+0000: 3101: error : qemuMonitorJSONCommandWithFd:309 : internal error: Missing monitor reply object

  2017-11-01T20:29:16.538762Z qemu-system-aarch64: KVM_SET_DEVICE_ATTR failed: Group 4 attr 0x0000000000000001: No such device or address
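
  To check other hosts for the same failure, the qemu log message
  quoted above can be matched with a small script. This is a sketch,
  not an official tool: the function name is hypothetical and the
  pattern is derived only from the one message shown in this report.
  It tolerates the message being wrapped across lines.

```python
import re
from typing import Dict, Optional, Union

# Pattern based solely on the log message quoted in this bug report.
# \s+ between tokens also matches a line break inside a wrapped message.
ATTR_FAILURE = re.compile(
    r"KVM_SET_DEVICE_ATTR\s+failed:\s+Group\s+(\d+)\s+attr\s+"
    r"(0x[0-9a-fA-F]+):\s+(.+)"
)


def find_device_attr_failure(
    log_text: str,
) -> Optional[Dict[str, Union[int, str]]]:
    """Return the group, attr and error text of the first failure, if any."""
    match = ATTR_FAILURE.search(log_text)
    if match is None:
        return None
    return {
        "group": int(match.group(1)),
        "attr": int(match.group(2), 16),
        "error": match.group(3).strip(),
    }
```

  Feeding it the contents of the domain's qemu log file would return
  None on a healthy host and a small dictionary on an affected one.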

  We traced this to the QEMU in Pike being incompatible with the 4.10
  kernel of 16.04.3. This QEMU version attempts to use the ITS
  migration functionality during reboot, which 4.10 does not support.
  When the ioctl fails, QEMU calls abort(), killing the VM.

  We believe QEMU should not attempt to use this functionality if the
  host kernel does not support it. We suggest the attached patch to
  resolve the issue.
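
  The idea of probing for the feature before queuing anything that
  depends on it can be illustrated with a toy model. This is not QEMU
  code and not the attached patch: FakeKvmDevice, register_its_handler
  and ITS_ATTR are invented names, and the (4, 1) pair is copied from
  the error message above without assuming anything about its meaning.

```python
import errno
from typing import Callable, Iterable, List, Tuple

# (group, attr) copied from the error message in this report.
ITS_ATTR: Tuple[int, int] = (4, 1)


class FakeKvmDevice:
    """Toy stand-in for a KVM device; not a real kernel interface."""

    def __init__(self, supported: Iterable[Tuple[int, int]]) -> None:
        self.supported = set(supported)

    def has_attr(self, group: int, attr: int) -> bool:
        return (group, attr) in self.supported

    def set_attr(self, group: int, attr: int) -> None:
        if not self.has_attr(group, attr):
            # Mirrors the "No such device or address" error seen in the log.
            raise OSError(errno.ENXIO, "No such device or address")


def register_its_handler(
    dev: FakeKvmDevice, handlers: List[Callable[[], None]]
) -> bool:
    """Queue the handler only when the host supports the attribute."""
    if not dev.has_attr(*ITS_ATTR):
        return False  # older kernel: nothing queued, nothing to fail later
    handlers.append(lambda: dev.set_attr(*ITS_ATTR))
    return True
```

  In this model, a device created with an empty supported set ends up
  with no queued handler, so a later reboot has nothing that could
  fail; a device that supports the attribute gets the handler as
  before.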

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1731051/+subscriptions


