[Bug 1994002] Re: [SRU] migration was active, but no RAM info was set
Mauricio Faria de Oliveira
1994002 at bugs.launchpad.net
Thu Mar 30 20:20:30 UTC 2023
Verification done for focal-proposed.
focal-updates: FAIL (status: active)
(qemu) info migrate
...
Migration status: active
total time: 0 milliseconds
focal-proposed: PASS (status: setup)
(qemu) info migrate
...
Migration status: setup
total time: 0 milliseconds
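
The PASS/FAIL criterion used above can be sketched as a small check on the text printed by 'info migrate'. This is only an illustration: the function names are hypothetical, and it assumes that RAM statistics, when present, appear as lines containing "ram:" (such lines are absent from both outputs in this comment).

```python
from typing import Optional


def migration_status(output: str) -> Optional[str]:
    """Return the value of the 'Migration status:' line, or None if absent."""
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Migration status:"):
            return line.split(":", 1)[1].strip()
    return None


def has_ram_stats(output: str) -> bool:
    """Assumption: RAM statistics show up as lines containing 'ram:'."""
    return any("ram:" in line for line in output.splitlines())


def verdict(output: str) -> str:
    """FAIL only for the inconsistent case: 'active' without RAM statistics."""
    if migration_status(output) == "active" and not has_ram_stats(output):
        return "FAIL"
    return "PASS"


if __name__ == "__main__":
    focal_updates = "Migration status: active\ntotal time: 0 milliseconds\n"
    focal_proposed = "Migration status: setup\ntotal time: 0 milliseconds\n"
    print("focal-updates:", verdict(focal_updates))
    print("focal-proposed:", verdict(focal_proposed))
```

A status of 'setup' without RAM statistics passes, because only 'active' is expected to carry them.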
Details:
=======
$ lsb_release -cs
focal
focal-updates: FAIL
-------------
$ curl http://ddebs.ubuntu.com/dbgsym-release-key.asc | sudo apt-key add -
$ sudo add-apt-repository -y 'deb http://ddebs.ubuntu.com/ubuntu focal-updates main'
$ sudo apt install --yes qemu-system-x86 qemu-system-x86-dbgsym
$ dpkg -s qemu-system-x86 | grep Version:
Version: 1:4.2-3ubuntu6.24
$ dpkg -s qemu-system-x86-dbgsym | grep Version:
Version: 1:4.2-3ubuntu6.24
...
$ sudo add-apt-repository -ys 'deb http://archive.ubuntu.com/ubuntu focal-updates main'
$ apt source qemu
$ head -n1 qemu-*/debian/changelog
qemu (1:4.2-3ubuntu6.24) focal-security; urgency=medium
$ vim qemu-*/migration/migration.c
915 static void fill_source_migration_info(MigrationInfo *info)
...
925 case MIGRATION_STATUS_SETUP:
926 info->has_status = true;
927 info->has_total_time = false;
928 break;
...
T1)
$ qemu-system-x86_64 -nodefaults -nographic -S -incoming tcp:0:4444
T2)
gdb \
-ex 'set non-stop on' -ex 'set pagination off' -ex 'set confirm off' \
qemu-system-x86_64
(gdb) b migrate_set_state
Breakpoint 1 at 0x6d3aa0: migrate_set_state. (2 locations)
(gdb) b migration/migration.c:928
Breakpoint 2 at 0x6d317b: file ./migration/migration.c, line 928.
(gdb) run -nodefaults -nographic -S -monitor tcp:0:3333,server,wait=off
T3)
nc 127.0.0.1 3333
(qemu) migrate -d tcp:127.0.0.1:4444
T2)
Thread 1 "qemu-system-x86" hit Breakpoint 1, migrate_set_state (state=0x5555566949d8, old_state=0, new_state=1) at ./migration/migration.c:1463
1463 ./migration/migration.c: No such file or directory.
(gdb) p (MigrationStatus) 0
$1 = MIGRATION_STATUS_NONE
(gdb) p (MigrationStatus) 1
$2 = MIGRATION_STATUS_SETUP
(gdb) c
Thread 5 "qemu-system-x86" hit Breakpoint 1, migrate_set_state (state=0x5555566949d8, old_state=1, new_state=4) at ./migration/migration.c:1463
1463 in ./migration/migration.c
(gdb) p (MigrationStatus) 1
$3 = MIGRATION_STATUS_SETUP
(gdb) p (MigrationStatus) 4
$4 = MIGRATION_STATUS_ACTIVE
(gdb)
T3)
(qemu) info migrate
T2)
Thread 1 "qemu-system-x86" hit Breakpoint 2, fill_source_migration_info (info=0x555556850590) at ./migration/migration.c:928
928 in ./migration/migration.c
(gdb) p (MigrationStatus) s.state
$6 = MIGRATION_STATUS_SETUP
(gdb) p info.status
$7 = MIGRATION_STATUS_NONE
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7ffff5ee55c0 (LWP 5066) "qemu-system-x86" fill_source_migration_info (info=0x555556850590) at ./migration/migration.c:928
2 Thread 0x7ffff5ee1700 (LWP 5070) "qemu-system-x86" (running)
3 Thread 0x7ffff565f700 (LWP 5071) "qemu-system-x86" (running)
5 Thread 0x7fffedfff700 (LWP 5075) "qemu-system-x86" migrate_set_state (state=0x5555566949d8, old_state=1, new_state=4) at ./migration/migration.c:1463
(gdb) thread 5
[Switching to thread 5 (Thread 0x7fffedfff700 (LWP 5075))]
#0 migrate_set_state (state=0x5555566949d8, old_state=1, new_state=4) at ./migration/migration.c:1463
1463 in ./migration/migration.c
(gdb) continue &
Continuing.
(gdb) info threads
Id Target Id Frame
1 Thread 0x7ffff5ee55c0 (LWP 5066) "qemu-system-x86" fill_source_migration_info (info=0x555556850590) at ./migration/migration.c:928
2 Thread 0x7ffff5ee1700 (LWP 5070) "qemu-system-x86" (running)
3 Thread 0x7ffff565f700 (LWP 5071) "qemu-system-x86" (running)
* 5 Thread 0x7fffedfff700 (LWP 5075) "qemu-system-x86" (running)
(gdb) thread 1
[Switching to thread 1 (Thread 0x7ffff5ee55c0 (LWP 5066))]
#0 fill_source_migration_info (info=0x555556850590) at ./migration/migration.c:928
928 in ./migration/migration.c
(gdb) p (MigrationStatus) s.state
$8 = MIGRATION_STATUS_ACTIVE
(gdb) c
T3)
(qemu) info migrate
info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
clear-bitmap-shift: 18
Migration status: active
total time: 0 milliseconds
(qemu)
Migration status is active, without any RAM statistics.
(qemu) quit
(gdb) quit
T1)
Ctrl-C
focal-proposed: PASS
--------------
$ sudo add-apt-repository -ys 'deb http://archive.ubuntu.com/ubuntu focal-proposed main'
$ sudo add-apt-repository -y 'deb http://ddebs.ubuntu.com/ubuntu focal-proposed main'
$ sudo apt install --yes qemu-system-x86 qemu-system-x86-dbgsym
$ dpkg -s qemu-system-x86 | grep Version:
Version: 1:4.2-3ubuntu6.25
$ dpkg -s qemu-system-x86-dbgsym | grep Version:
Version: 1:4.2-3ubuntu6.25
...
$ apt source qemu
$ head -n1 qemu-*/debian/changelog
qemu (1:4.2-3ubuntu6.25) focal; urgency=medium
$ vim qemu-*/migration/migration.c
915 static void fill_source_migration_info(MigrationInfo *info)
...
926 case MIGRATION_STATUS_SETUP:
927 info->has_status = true;
928 info->has_total_time = false;
929 break;
...
T1)
$ qemu-system-x86_64 -nodefaults -nographic -S -incoming tcp:0:4444
T2)
gdb \
-ex 'set non-stop on' -ex 'set pagination off' -ex 'set confirm off' \
qemu-system-x86_64
(gdb) b migrate_set_state
Breakpoint 1 at 0x6d3b80: migrate_set_state. (2 locations)
(gdb) b migration/migration.c:928
Breakpoint 2 at 0x6d32ad: file ./migration/migration.c, line 928.
(gdb) run -nodefaults -nographic -S -monitor tcp:0:3333,server,wait=off
T3)
nc 127.0.0.1 3333
(qemu) migrate -d tcp:127.0.0.1:4444
T2)
Thread 1 "qemu-system-x86" hit Breakpoint 1, migrate_set_state (state=0x5555566949d8, old_state=0, new_state=1) at ./migration/migration.c:1464
1464 ./migration/migration.c: No such file or directory.
(gdb) p (MigrationStatus) 0
$1 = MIGRATION_STATUS_NONE
(gdb) p (MigrationStatus) 1
$2 = MIGRATION_STATUS_SETUP
(gdb) c
Continuing.
[New Thread 0x7fffedfff700 (LWP 6990)]
[New Thread 0x7fffed7fe700 (LWP 6991)]
[Thread 0x7fffedfff700 (LWP 6990) exited]
Thread 5 "qemu-system-x86" hit Breakpoint 1, migrate_set_state (state=0x5555566949d8, old_state=1, new_state=4) at ./migration/migration.c:1464
1464 in ./migration/migration.c
(gdb) p (MigrationStatus) 1
$3 = MIGRATION_STATUS_SETUP
(gdb) p (MigrationStatus) 4
$4 = MIGRATION_STATUS_ACTIVE
(gdb)
T3)
(qemu) info migrate
T2)
Thread 1 "qemu-system-x86" hit Breakpoint 2, fill_source_migration_info (info=0x555556850590) at ./migration/migration.c:928
928 in ./migration/migration.c
(gdb) p (MigrationStatus) s.state
$6 = MIGRATION_STATUS_SETUP
(gdb) p info.status
$7 = MIGRATION_STATUS_NONE
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7ffff5ee55c0 (LWP 6983) "qemu-system-x86" fill_source_migration_info (info=0x555556850590) at ./migration/migration.c:928
2 Thread 0x7ffff5ee1700 (LWP 6987) "qemu-system-x86" (running)
3 Thread 0x7ffff565f700 (LWP 6988) "qemu-system-x86" (running)
5 Thread 0x7fffed7fe700 (LWP 6991) "qemu-system-x86" migrate_set_state (state=0x5555566949d8, old_state=1, new_state=4) at ./migration/migration.c:1464
(gdb) thread 5
[Switching to thread 5 (Thread 0x7fffed7fe700 (LWP 6991))]
#0 migrate_set_state (state=0x5555566949d8, old_state=1, new_state=4) at ./migration/migration.c:1464
1464 in ./migration/migration.c
(gdb) continue &
Continuing.
(gdb) info threads
Id Target Id Frame
1 Thread 0x7ffff5ee55c0 (LWP 6983) "qemu-system-x86" fill_source_migration_info (info=0x555556850590) at ./migration/migration.c:928
2 Thread 0x7ffff5ee1700 (LWP 6987) "qemu-system-x86" (running)
3 Thread 0x7ffff565f700 (LWP 6988) "qemu-system-x86" (running)
* 5 Thread 0x7fffed7fe700 (LWP 6991) "qemu-system-x86" (running)
(gdb) thread 1
[Switching to thread 1 (Thread 0x7ffff5ee55c0 (LWP 6983))]
#0 fill_source_migration_info (info=0x555556850590) at ./migration/migration.c:928
928 in ./migration/migration.c
(gdb) p (MigrationStatus) s.state
$8 = MIGRATION_STATUS_ACTIVE
(gdb) c
T3)
(qemu) info migrate
info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
clear-bitmap-shift: 18
Migration status: setup
total time: 0 milliseconds
The status is still 'setup' (which is not expected to carry RAM
statistics) rather than 'active' (which is expected to carry them;
reporting 'active' without them is what caused the issue).
(qemu) quit
(gdb) quit
T1)
Ctrl-C
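
Both GDB sessions above hold the monitor thread at the breakpoint inside fill_source_migration_info() while thread 5 completes the SETUP -> ACTIVE transition. The Python model below is only a sketch of that interleaving. It assumes that the focal-updates build reads the state a second time after the switch statement, while the focal-proposed build reuses a single snapshot. That is inferred from the behaviour observed above, not copied from the patch. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional

SETUP = "setup"
ACTIVE = "active"


@dataclass
class MigrationState:
    """Stand-in for the shared migration state (hypothetical model)."""
    state: str = SETUP


@dataclass
class MigrationInfo:
    """Stand-in for the reply built for 'info migrate' (hypothetical model)."""
    status: Optional[str] = None
    has_ram: bool = False


def fill_info_read_twice(s: MigrationState, pause: Callable[[], None]) -> MigrationInfo:
    """Assumed pre-fix behaviour: the state is read again after the switch."""
    info = MigrationInfo()
    if s.state == ACTIVE:      # first read decides which fields are filled
        info.has_ram = True
    pause()                    # models the breakpoint; the state may change here
    info.status = s.state      # second read can disagree with the first
    return info


def fill_info_read_once(s: MigrationState, pause: Callable[[], None]) -> MigrationInfo:
    """Assumed post-fix behaviour: one snapshot drives both decisions."""
    state = s.state            # single read
    info = MigrationInfo()
    if state == ACTIVE:
        info.has_ram = True
    pause()                    # a state change here no longer matters
    info.status = state
    return info


if __name__ == "__main__":
    for fill in (fill_info_read_twice, fill_info_read_once):
        s = MigrationState()
        # The pause callback plays the role of thread 5 finishing the transition.
        info = fill(s, lambda: setattr(s, "state", ACTIVE))
        print(fill.__name__, info.status, info.has_ram)
```

In this model the read-twice variant reports 'active' with no RAM information, matching the focal-updates result, and the read-once variant reports 'setup', matching the focal-proposed result.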
** Tags removed: verification-needed-focal
** Tags added: verification-done-focal
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1994002
Title:
[SRU] migration was active, but no RAM info was set
Status in Ubuntu Cloud Archive:
New
Status in Ubuntu Cloud Archive ussuri series:
New
Status in qemu package in Ubuntu:
Fix Released
Status in qemu source package in Bionic:
Fix Committed
Status in qemu source package in Focal:
Fix Committed
Status in qemu source package in Jammy:
Fix Committed
Status in qemu source package in Kinetic:
Fix Released
Bug description:
[Impact]
* While live-migrating many instances concurrently, libvirt sometimes
returns `internal error: migration was active, but no RAM info was
set:`
* Effects of this bug are mostly observed in large scale clusters
with a lot of live migration activity.
* Has second-order effects for consumers of the migration monitor, such as
libvirt and OpenStack.
[Test Case]
Synthetic reproducer with GDB in comment #21.
Steps to Reproduce:
1. live evacuate a compute
2. live migration of one or more instances fails with the above error
N.B. Due to the nature of this bug, it is difficult to reproduce consistently.
In an environment where it has been observed, it is estimated to occur in approximately 1 in 1000 migrations.
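
To give a feel for why a synthetic reproducer is needed, the arithmetic below estimates the chance of seeing at least one failure in a batch of migrations. It assumes the estimated rate of 1 in 1000 applies independently to every migration, which is a simplification. The function name is hypothetical.

```python
def p_at_least_one_failure(migrations: int, rate: float = 1 / 1000) -> float:
    """Probability of one or more failures, assuming independent migrations."""
    return 1.0 - (1.0 - rate) ** migrations


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(f"{n:5d} migrations: {p_at_least_one_failure(n):.3f}")
```

Under these assumptions a batch of 100 migrations hits the error a little under 10% of the time, so a single live evacuation is an unreliable test.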
[Where problems could occur]
* In the event of a regression the migration monitor may report an inconsistent state.
[Original Bug Description]
While live-migrating many instances concurrently, libvirt sometimes returns internal error: migration was active, but no RAM info was set:
~~~
2022-03-30 06:08:37.197 7 WARNING nova.virt.libvirt.driver [req-5c3296cf-88ee-4af6-ae6a-ddba99935e23 - - - - -] [instance: af339c99-1182-4489-b15c-21e52f50f724] Error monitoring migration: internal error: migration was active, but no RAM info was set: libvirt.libvirtError: internal error: migration was active, but no RAM info was set
~~~
From upstream bug: https://bugzilla.redhat.com/show_bug.cgi?id=2074205
[Other Information]
Related bug: https://bugs.launchpad.net/nova/+bug/1982284
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1994002/+subscriptions