[Bug 1641532] Re: machine-types trusty and utopic are not unique (depend on the qemu version)

ChristianEhrhardt 1641532 at bugs.launchpad.net
Mon Feb 20 08:02:46 UTC 2017


Hi,
I contacted smoser, who was recently hacking on open-iscsi for related reasons.
I'll quote his mail for the record, but the TL;DR is: we should be fine ignoring the failed open-iscsi test results in Yakkety.

Reasons:
- The test is known to be unreliable
- It works when reproduced in a local ADT run
- The Xenial test is set to ignored failure as well

Actions:
- I think this should be marked as an ignored failure in Yakkety as well

Xenial: http://people.canonical.com/~ubuntu-archive/proposed-migration/xenial/update_excuses.html#qemu
Yakkety: http://people.canonical.com/~ubuntu-archive/proposed-migration/yakkety/update_excuses.html#qemu

Scott's detailed reply is quoted at http://paste.ubuntu.com/24032648/

With open-iscsi in Yakkety being ignorable and all other tests good in
autopkgtest, QA and migration tests, I am setting verification-done.


** Tags removed: verification-needed
** Tags added: verification-done verification-done-xenial verification-done-yakkety

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1641532

Title:
  machine-types trusty and utopic are not unique (depend on the qemu
  version)

Status in Ubuntu Cloud Archive:
  Fix Committed
Status in Ubuntu Cloud Archive liberty series:
  In Progress
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Xenial:
  Fix Committed
Status in qemu source package in Yakkety:
  Fix Committed
Status in qemu source package in Zesty:
  Fix Released

Bug description:
  [Impact]

   * The Trusty (and Utopic) machine types are not unique: their definition
     depends on the qemu version. Because of that, migrations between the
     qemu versions of different Ubuntu releases can fail.

   * Many migrations work just by luck; one that fails is the Utopic type
     on Cloud Archive Liberty migrating to Xenial.

   * The further one migrates such an old guest, the more breakage
     accumulates. E.g. a Trusty guest (qemu 2.0) is migrated to Xenial
     (which thinks it is a qemu 2.5 type) and from there to Yakkety
     (which expects it to be a 2.6 type).

   * The fix is minimal and makes the type definitions stable across
     releases, as they were intended to be.
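
The disagreement described above can be sketched as a small model. This is only an illustration: the version numbers are the ones from the example in this description, the names (BEFORE_FIX, AFTER_FIX, hosts_agree, disagreeing_hops) are made up for the sketch, and a disagreement between hosts does not always mean the migration fails, since some work by luck.

```python
# Illustrative model only. The values are taken from the example in this bug
# description; they are not read from the qemu source code.

# qemu version that each host release resolves the "trusty" machine type to.
BEFORE_FIX = {"trusty": "2.0", "xenial": "2.5", "yakkety": "2.6"}
# After the fix every release is meant to agree on the original definition.
AFTER_FIX = {release: "2.0" for release in BEFORE_FIX}


def hosts_agree(table, src, dst):
    """Return True if both hosts resolve the type name to the same definition."""
    return table[src] == table[dst]


def disagreeing_hops(table, path):
    """List the (src, dst) hops on a migration path where the hosts disagree."""
    return [(a, b) for a, b in zip(path, path[1:]) if not hosts_agree(table, a, b)]


path = ["trusty", "xenial", "yakkety"]
print(disagreeing_hops(BEFORE_FIX, path))  # every hop disagrees
print(disagreeing_hops(AFTER_FIX, path))   # no hop disagrees
```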

  [Test Case]

   * Spawn a guest of type Utopic (the easiest still supported setup is
     Trusty + Cloud Archive Liberty), then migrate it to Xenial. Without
     the patches the migration fails; with the patches it works, as both
     ends now agree on what the guest definition means. Please note that
     both ends of the migration have to be fixed to get it working.

   * Similarly, with the fixes applied a guest of type Trusty can be
     migrated X->Y->Z and back to X, while without the fix any backward
     migration (and probably a future forward migration) will fail.

   * Note: This is complex and there are many potential combinations. The
     attached test logs (comment #15) cover various permutations of those.
     Testing has shown that not only the reported Utopic type issue gets
     fixed, but several backward migrations as well.
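
To get a feeling for the number of combinations, a naive test matrix can be enumerated as below. This is a sketch under assumptions: X, Y and Z are taken to mean Xenial, Yakkety and Zesty, the helper name migration_matrix is hypothetical, and the list is not claimed to match what the attached test logs actually covered.

```python
from itertools import permutations, product

# Assumed meaning of X, Y, Z in the test case; adjust as needed.
HOSTS = ["xenial", "yakkety", "zesty"]
MACHINE_TYPES = ["trusty", "utopic"]


def migration_matrix(hosts, machine_types):
    """Every (machine type, source host, destination host) with src != dst."""
    return [
        (mtype, src, dst)
        for mtype, (src, dst) in product(machine_types, permutations(hosts, 2))
    ]


matrix = migration_matrix(HOSTS, MACHINE_TYPES)
for mtype, src, dst in matrix:
    print(f"{mtype}: {src} -> {dst}")
```

Because permutations are ordered, the backward direction (e.g. yakkety -> xenial) is listed alongside the forward one.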

  [Regression Potential]

   * While it fixes the cases that we know of and, as testing showed,
     several cases that we didn't know of before, there are two things we
     cannot avoid.
     1. People have to restart the source guests so that the new fixed
        definition takes effect.
        Likewise, Trusty guests that were already migrated to a host that
        has the error have to be restarted before they can be migrated
        further.
        Note: no one has to restart guests on Trusty without Cloud
        Archive; there the Trusty type is ok. It is a 2.0 type, which
        Xenial/Yakkety agree on after the fix.

     2. Restarting the guests after the fix will "downgrade" the virtual
        hardware. One can think of the machine types as the HW revision of
        the virtual HW. A guest that was created as e.g. Trusty currently
        has incorrectly "too new" virtual HW on Xenial. Restarting the
        guest will fix that, but as part of that, new attributes that it
        incorrectly gained when migrating/moving to the new host will be
        taken away (to match the definition the guest had when it was
        started).
        This is actually a fix, but might appear as a regression to
        somebody who does not know what was going on.
        Also, anybody that "wants" the new HW can just upgrade the machine
        type to get it, which is actually recommended anyway [1].

  [Other Info]

   * This is a complex issue; please catch me (cpaelzer) on IRC if you
     need/want to go into detail.
     For Cloud Archive questions, ask coreycb.

  [1]: https://wiki.ubuntu.com/QemuKVMMigration#Upgrade_machine_type

  --- original description ---

  Hi,

  I'm currently live-migrating many VMs from an old server to a new one,
  and some VMs can't be live migrated.

  The source host is trusty with qemu-system-x86 1:2.3+dfsg-5ubuntu9.4~cloud2
  The destination host is xenial with qemu-system-x86 1:2.5+dfsg-5ubuntu10.6

  When the issue occurs, the destination host raises an error [1] and
  stops the migration process.

  The only difference I see between VMs where live migration works and
  those where it doesn't work is a different machine type.

  * migration works when the VM has been created with pc-i440fx-vivid
  * migration doesn't work when the VM has been created with pc-i440fx-utopic

  [1] the qemu error reported by libvirt on the destination host

  2016-11-14 08:25:40.774+0000: starting up libvirt version: 1.3.1, package: 1ubuntu10.5 (Stefan Bader <stefan.bader at canonical.com> Thu, 06 Oct 2016 13:07:20 +0200), qemu version: 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.6), hostname: n7.tetaneutral.net
  LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin QEMU_AUDIO_DRV=spice /usr/bin/kvm-spice -name a81e7133-9601-4432-86dd-a2401dcad8c2 -S -machine pc-i440fx-utopic,accel=kvm,usb=off -cpu Nehalem -m 256 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid a81e7133-9601-4432-86dd-a2401dcad8c2 -smbios 'type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=2015.1.2,serial=fe2641d2-543a-4d65-b75f-6337bf4b8744,uuid=a81e7133-9601-4432-86dd-a2401dcad8c2' -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-a81e7133-9601-4432-86dd-a2401dcad8c2/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive 'file=rbd:disks/a81e7133-9601-4432-86dd-a2401dcad8c2_disk.config:id=openstack-service:key=XXXXXXXXXXXX==:auth_supported=cephx\;none:mon_host=192.168.99.251\:6789\;192.168.99.252\:6789\;192.168.99.253\:6789,format=raw,if=none,id=drive-ide0-1-1,readonly=on,cache=none,aio=native' -device ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1 -drive 'file=rbd:disks/volume-a82bb407-5ccb-4b0e-ba68-a0de1cd58cc3:id=openstack-service:key=XXXXXXXXXXXXXXXXXX:auth_supported=cephx\;none:mon_host=192.168.99.251\:6789\;192.168.99.252\:6789\;192.168.99.253\:6789,format=raw,if=none,id=drive-virtio-disk0,serial=a82bb407-5ccb-4b0e-ba68-a0de1cd58cc3,cache=none,aio=native' -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=43,id=hostnet0,vhost=on,vhostfd=45 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:37:fb:ba,bus=pci.0,addr=0x3 -chardev file,id=charserial0,path=/var/lib/nova/instances/a81e7133-9601-4432-86dd-a2401dcad8c2/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device 
isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0 -spice port=5916,addr=0.0.0.0,disable-ticketing,seamless-migration=on -k en-us -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=pci.0,addr=0x2 -incoming defer -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on
  2016-11-14T08:25:44.216364Z qemu-system-x86_64: Unknown ramblock "/rom@etc/acpi/rsdp", cannot accept migration
  2016-11-14T08:25:44.216394Z qemu-system-x86_64: error while loading state for instance 0x0 of device 'ram'
  2016-11-14T08:25:44.216509Z qemu-system-x86_64: load of migration failed: Invalid argument
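
Since the machine type is the only visible difference between working and failing VMs, it can help to pull it out of such a qemu command line. The snippet below is a minimal sketch, assuming the `-machine <type>,<options>` form seen in the log above; the function name machine_type is made up for the sketch and the sample is an abbreviated excerpt of that log line.

```python
import re


def machine_type(cmdline):
    """Return the machine type named after -machine, or None if absent.

    Assumes the "-machine <type>,<options>" form seen in the log above.
    """
    match = re.search(r"-machine\s+([^,\s]+)", cmdline)
    return match.group(1) if match else None


# Abbreviated excerpt of the command line from the log above.
sample = (
    "/usr/bin/kvm-spice -name a81e7133-9601-4432-86dd-a2401dcad8c2 -S "
    "-machine pc-i440fx-utopic,accel=kvm,usb=off -cpu Nehalem -m 256"
)
print(machine_type(sample))
```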

  Cheers,

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1641532/+subscriptions


