[Bug 1894772] Re: live migration of windows 2012 r2 instance with virtio balloon driver fails from mitaka to queens.

Seyeong Kim 1894772 at bugs.launchpad.net
Thu Sep 10 10:09:54 UTC 2020


** Patch added: "lp1894772_bionic.debdiff"
   https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1894772/+attachment/5409318/+files/lp1894772_bionic.debdiff

** Changed in: qemu (Ubuntu Bionic)
       Status: New => In Progress

** Changed in: qemu (Ubuntu Bionic)
     Assignee: (unassigned) => Seyeong Kim (seyeongkim)

** Changed in: qemu (Ubuntu Focal)
       Status: New => In Progress

** Changed in: qemu (Ubuntu Focal)
     Assignee: (unassigned) => Seyeong Kim (seyeongkim)

-- 
You received this bug notification because you are a member of Ubuntu
Sponsors Team, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1894772

Title:
  live migration of windows 2012 r2 instance with virtio balloon driver
  fails from mitaka to queens.

Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Bionic:
  In Progress
Status in qemu source package in Focal:
  In Progress
Status in qemu source package in Groovy:
  Fix Released

Bug description:
  [Impact]

  Live migration of a Windows 2012 R2 instance with the virtio balloon
  driver from qemu 2.5 (mitaka) to qemu 2.11 (queens) does not work
  properly.

  It fails in particular when the instance has already been migrated
  once, e.g. 2.5 -> 2.5 -> 2.11.

  The second mitaka node then shows the message below.

  Migration: [ 94 %]error: internal error: qemu unexpectedly closed the monitor: 2020-09-07T07:45:11.799345Z qemu-system-x86_64: warning: Unknown firmware file in legacy mode: etc/msr_feature_control
  2020-09-07T07:45:12.765618Z qemu-system-x86_64: VQ 2 size 0x80 < last_avail_idx 0x1 - used_idx 0x2
  2020-09-07T07:45:12.765642Z qemu-system-x86_64: Failed to load virtio-balloon:virtio
  2020-09-07T07:45:12.765648Z qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:07.0/virtio-balloon'
  2020-09-07T07:45:12.766483Z qemu-system-x86_64: load of migration failed: Operation not permitted

  After the patches for CVE-2016-5403 were applied, we worked around
  this with CVE-2015-5403-6.patch.

  [Test Case]

  Deploy 2 mitaka-staging machines as KVM hosts
  Deploy 1 queens-staging machine as a KVM host

  Set up an NFS server and clients between them.

  Deploy a Windows 2012 R2 guest instance with the virtio balloon driver
  on one of the mitaka hosts

  Migrate it from mitaka to mitaka (this should succeed)
  Migrate it from mitaka to queens (this raises the error)

  I can reproduce this issue with bare-metal or VM hosts

  [Regressions]
  As this patch changes qemu, running instances must be restarted to pick up the fix.
  This patch may also cause failures when starting or migrating VMs that use virtio drivers,
  especially Windows guest VMs.

  [Others]

  Description: make sure vdev->vq[i].inuse never goes below 0
   This is a work-around to fix live migrations after the patches for
   CVE-2016-5403 were applied. The true root cause still needs to be
   determined.
  Origin: based on a patch by Len <lwhite at coreitx.com>
  Bug-Ubuntu: https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1647389

  Index: qemu-2.5+dfsg/hw/virtio/virtio.c
  ===================================================================
  --- qemu-2.5+dfsg.orig/hw/virtio/virtio.c       2017-04-05 09:48:17.420025137 -0400
  +++ qemu-2.5+dfsg/hw/virtio/virtio.c    2017-04-05 09:49:59.565337543 -0400
  @@ -1510,6 +1510,7 @@
       for (i = 0; i < num; i++) {
           if (vdev->vq[i].vring.desc) {
               uint16_t nheads;
  +            int inuse_tmp;
               nheads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx;
               /* Check it isn't doing strange things with descriptor numbers. */
               if (nheads > vdev->vq[i].vring.num) {
  @@ -1527,12 +1528,15 @@
                * Since max ring size < UINT16_MAX it's safe to use modulo
                * UINT16_MAX + 1 subtraction.
                */
  -            vdev->vq[i].inuse = (uint16_t)(vdev->vq[i].last_avail_idx -
  +            inuse_tmp = (int)(vdev->vq[i].last_avail_idx -
                                   vring_used_idx(&vdev->vq[i]));
  +
  +            vdev->vq[i].inuse = (inuse_tmp < 0 ? 0 : inuse_tmp);
  +
               if (vdev->vq[i].inuse > vdev->vq[i].vring.num) {
  -                error_report("VQ %d size 0x%x < last_avail_idx 0x%x - "
  +                error_report("VQ %d inuse %u size 0x%x < last_avail_idx 0x%x - "
                                "used_idx 0x%x",
  -                             i, vdev->vq[i].vring.num,
  +                             i, vdev->vq[i].inuse, vdev->vq[i].vring.num,
                                vdev->vq[i].last_avail_idx,
                                vring_used_idx(&vdev->vq[i]));
                   return -1;

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1894772/+subscriptions


