[Bug 997978] Re: KVM images lose connectivity with bridged network
Gary Cuozzo
gary at isgsoftware.net
Tue Sep 25 16:31:05 UTC 2012
I don't believe just rebooting a guest will cause a new KVM instance to
load. As a test, I just rebooted a guest VM on a system here and the
pid of the kvm process did not change. I think it may be possible that
you are still running on the old software.
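One way to check this is to compare the kvm pid of each guest before and after the reboot. The sketch below is only an illustration: it assumes `ps -eo pid,args` style output and the `-name <guest>` form of the kvm command line, and the helper names `guest_pids` and `still_on_old_process` are made up for this example.

```python
import os


def guest_pids(ps_output):
    """Map guest name -> pid from `ps -eo pid,args` style lines (assumed format)."""
    pids = {}
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip the header line and anything that is not "<pid> <command> ...".
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        binary = os.path.basename(fields[1])
        if not (binary.startswith("kvm") or binary.startswith("qemu")):
            continue
        # Assumption: the guest name is passed as "-name <guest>".
        if "-name" in fields[1:-1]:
            name = fields[fields.index("-name", 1) + 1]
            pids[name] = int(fields[0])
    return pids


def still_on_old_process(before, after):
    """Guests whose kvm pid is the same in both snapshots."""
    return sorted(name for name, pid in before.items() if after.get(name) == pid)


if __name__ == "__main__":
    # Hypothetical snapshots taken before and after rebooting the guests.
    before = guest_pids("  PID COMMAND\n 4242 /usr/bin/kvm -S -name mailer01\n")
    after = guest_pids("  PID COMMAND\n 4242 /usr/bin/kvm -S -name mailer01\n")
    print(still_on_old_process(before, after))
```

Any guest reported here is still running in the kvm process that was started before the package upgrade, so it would presumably need a full stop and start to pick up the new binary.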
Also, to update my data point... On my server which was experiencing
issues, I rebooted the host just to make sure everything was fresh.
It's been about a month and I have not experienced the failure again. I
was typically going a few weeks between issues.
gary
----- Original Message -----
From: "Matt Hilt" <mjhilt at gmail.com>
To: gary at isgsoftware.net
Sent: Tuesday, September 25, 2012 12:07:44 PM
Subject: [Bug 997978] Re: KVM images lose connectivity with bridged network
Soren,
We have a 12.04 based OpenStack cluster with 4 host nodes running about 30 VMs currently.
We performed the steps to add the kvm-network-hang repo and updated to the latest version on the host machines, then rebooted the instances. My understanding is that this should catch the update, since a new KVM command is run on reboot.
I caught the first failure ~12 hours after the upgrade. It had the usual
symptoms: networking loss, but the VM is still up and an active VNC
session was possible. I thought I just might have missed a reboot on one
of the VMs, so I didn't report anything. The second failure happened
yesterday, but someone else caught it and rebooted the VM. As best we
can tell after the fact, it looks like the usual failure (no full
hard drive, no kernel panic, and nothing else that got logged).
As I mentioned before, we used to see at least one failure per day,
usually many more. This patch has at least reduced the occurrences to a
minimum. These non-deterministic bugs are hard to track down.
--
You received this bug notification because you are subscribed to the bug
report.
https://bugs.launchpad.net/bugs/997978
Title:
KVM images lose connectivity with bridged network
Status in OpenStack Compute (Nova):
Invalid
Status in “qemu-kvm” package in Ubuntu:
Fix Released
Status in “qemu-kvm” source package in Precise:
In Progress
Bug description:
=========================================
SRU Justification:
1. Impact: networking breaks after a while in kvm guests using virtio networking
2. Development fix: The bug was fixed upstream and the fix picked up in a new
merge.
3. Stable fix: 3 virtio patches are cherrypicked from upstream:
a821ce5 virtio: order index/descriptor reads
92045d8 virtio: add missing mb() on enable notification
a281ebc virtio: add missing mb() on notification
4. Test case: Create a bridge enslaving the real NIC, and use that as the bridge
for a kvm instance with virtio networking. See comment #44 for specific test
case.
5. Regression potential: Should be low as several people have tested the fixed
package under heavy load.
=========================================
System:
-----------
Dell R410, dual processor, 2.4 GHz, with 16 GB RAM
Distributor ID: Ubuntu
Description: Ubuntu 12.04 LTS
Release: 12.04
Codename: precise
Setup:
---------
We're running 3 KVM guests, all Ubuntu 12.04 LTS using bridged networking.
From the host:
# cat /etc/network/interfaces
auto br0
iface br0 inet static
    address 212.XX.239.98
    netmask 255.255.255.240
    gateway 212.XX.239.97
    bridge_ports eth0
    bridge_fd 9
    bridge_hello 2
    bridge_maxage 12
    bridge_stp off
# ifconfig eth0
eth0 Link encap:Ethernet HWaddr d4:ae:52:84:2d:5a
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:11278363 errors:0 dropped:3128 overruns:0 frame:0
TX packets:14437384 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:4115980743 (4.1 GB) TX bytes:5451961979 (5.4 GB)
Interrupt:36 Memory:da000000-da012800
# ifconfig br0
br0 Link encap:Ethernet HWaddr d4:ae:52:84:2d:5a
inet addr:212.XX.239.98 Bcast:212.XX.239.111 Mask:255.255.255.240
inet6 addr: fe80::d6ae:52ff:fe84:2d5a/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1720861 errors:0 dropped:0 overruns:0 frame:0
TX packets:1708622 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:210152198 (210.1 MB) TX bytes:300858508 (300.8 MB)
# brctl show
bridge name bridge id STP enabled interfaces
br0 8000.d4ae52842d5a no eth0
I have no default network configured to autostart in libvirt as we're using bridged networking:
# virsh net-list --all
Name State Autostart
-----------------------------------------
default inactive no
# arp
Address HWtype HWaddress Flags Mask Iface
mailer03.xxxx.com ether 52:54:00:82:5f:0f C br0
mailer01.xxxx.com ether 52:54:00:d2:f7:31 C br0
mailer02.xxxx.com ether 52:54:00:d3:8f:91 C br0
dxi-gw2.xxxx.com ether 00:1a:30:2a:b1:c0 C br0
From one of the guests:
<domain type='kvm' id='4'>
<name>mailer01</name>
<uuid>d41d1355-84e8-ae23-e84e-227bc0231b97</uuid>
<memory>2097152</memory>
<currentMemory>2097152</currentMemory>
<vcpu>1</vcpu>
<os>
<type arch='x86_64' machine='pc-1.0'>hvm</type>
<boot dev='hd'/>
</os>
<features>
<acpi/>
</features>
<clock offset='utc'/>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<devices>
<emulator>/usr/bin/kvm</emulator>
<disk type='file' device='disk'>
<driver name='qemu' type='raw'/>
<source file='/dev/mapper/vg_main-mailer01--root'/>
<target dev='hda' bus='ide'/>
<alias name='ide0-0-0'/>
<address type='drive' controller='0' bus='0' unit='0'/>
</disk>
<disk type='file' device='disk'>
<driver name='qemu' type='raw'/>
<source file='/dev/mapper/vg_main-mailer01--swap'/>
<target dev='hdb' bus='ide'/>
<alias name='ide0-0-1'/>
<address type='drive' controller='0' bus='0' unit='1'/>
</disk>
<controller type='ide' index='0'>
<alias name='ide0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
</controller>
<interface type='bridge'>
<mac address='52:54:00:d2:f7:31'/>
<source bridge='br0'/>
<target dev='vnet0'/>
<model type='virtio'/>
<alias name='net0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
<serial type='pty'>
<source path='/dev/pts/0'/>
<target port='0'/>
<alias name='serial0'/>
</serial>
<console type='pty' tty='/dev/pts/0'>
<source path='/dev/pts/0'/>
<target type='serial' port='0'/>
<alias name='serial0'/>
</console>
<input type='mouse' bus='ps2'/>
<graphics type='vnc' port='5900' autoport='yes' listen='127.0.0.1'>
<listen type='address' address='127.0.0.1'/>
</graphics>
<video>
<model type='cirrus' vram='9216' heads='1'/>
<alias name='video0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</video>
<memballoon model='virtio'>
<alias name='balloon0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</memballoon>
</devices>
<seclabel type='dynamic' model='apparmor' relabel='yes'>
<label>libvirt-d41d1355-84e8-ae23-e84e-227bc0231b97</label>
<imagelabel>libvirt-d41d1355-84e8-ae23-e84e-227bc0231b97</imagelabel>
</seclabel>
</domain>
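To confirm which guests actually use the virtio NIC model that the fix targets, the interface section of a domain definition like the one above can be read programmatically. This is a minimal sketch using only the Python standard library; the function name `interface_models` is hypothetical, and the sample XML is a trimmed copy of the definition above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the domain definition shown above.
SAMPLE_XML = """
<domain type='kvm' id='4'>
  <name>mailer01</name>
  <devices>
    <interface type='bridge'>
      <mac address='52:54:00:d2:f7:31'/>
      <source bridge='br0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
    </interface>
  </devices>
</domain>
"""


def interface_models(domain_xml):
    """Return mac, bridge and NIC model for each <interface> in a domain XML."""
    root = ET.fromstring(domain_xml)
    result = []
    for iface in root.findall("./devices/interface"):
        mac = iface.find("mac")
        source = iface.find("source")
        model = iface.find("model")
        result.append({
            "mac": mac.get("address") if mac is not None else None,
            "bridge": source.get("bridge") if source is not None else None,
            "model": model.get("type") if model is not None else None,
        })
    return result


if __name__ == "__main__":
    for nic in interface_models(SAMPLE_XML):
        print(nic)
```

On a live host the XML would typically come from `virsh dumpxml <guest>`; here it is passed in as a string so the sketch stays self-contained.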
From within the guest:
# cat /etc/network/interfaces
# The primary network interface
auto eth0
iface eth0 inet static
    address 212.XX.239.100
    netmask 255.255.255.240
    network 212.XX.239.96
    broadcast 212.XX.239.111
    gateway 212.XX.239.97
# ifconfig
eth0 Link encap:Ethernet HWaddr 52:54:00:d2:f7:31
inet addr:212.XX.239.100 Bcast:212.XX.239.111 Mask:255.255.255.240
inet6 addr: fe80::5054:ff:fed2:f731/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:5631830 errors:0 dropped:0 overruns:0 frame:0
TX packets:6683416 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2027322829 (2.0 GB) TX bytes:2076698690 (2.0 GB)
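The network and broadcast values in the guest configuration follow from the address and netmask, which can be double-checked with Python's `ipaddress` module. Because the second octet is masked as XX in this report, the sketch below substitutes the documentation range 192.0.2.0/24 and keeps only the last octet; that substitution and the function name `subnet_summary` are assumptions for illustration.

```python
import ipaddress


def subnet_summary(address, netmask):
    """Derive network, broadcast and prefix length from an address and netmask."""
    iface = ipaddress.ip_interface(f"{address}/{netmask}")
    net = iface.network
    return {
        "network": str(net.network_address),
        "broadcast": str(net.broadcast_address),
        "prefixlen": net.prefixlen,
        # Network and broadcast addresses are not assignable to hosts.
        "usable_hosts": net.num_addresses - 2,
    }


def in_same_subnet(address, netmask, other):
    """True if `other` falls inside the subnet implied by address/netmask."""
    net = ipaddress.ip_interface(f"{address}/{netmask}").network
    return ipaddress.ip_address(other) in net


if __name__ == "__main__":
    # Placeholder addresses: only the last octet matches the report.
    print(subnet_summary("192.0.2.100", "255.255.255.240"))
    print(in_same_subnet("192.0.2.100", "255.255.255.240", "192.0.2.97"))
```

With a 255.255.255.240 mask the last octet 100 falls in the block 96 to 111, which agrees with the network (.96), broadcast (.111) and gateway (.97) lines in the configuration, so the addressing itself looks consistent.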
The command line that starts the KVM guest:
/usr/bin/kvm -S -M pc-1.0 -enable-kvm -m 2048 -smp 1,sockets=1,cores=1,threads=1 -name mailer01 -uuid d41d1355-84e8-ae23-e84e-227bc0231b97 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/mailer01.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -drive file=/dev/mapper/vg_main-mailer01--root,if=none,id=drive-ide0-0-0,format=raw -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive file=/dev/mapper/vg_main-mailer01--swap,if=none,id=drive-ide0-0-1,format=raw -device ide-drive,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -netdev tap,fd=18,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:d2:f7:31,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -usb -vnc 127.0.0.1:0 -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4
Problem:
------------
Periodically (at least once a day), one or more of the guests loses network connectivity. Ping responds with 'host unreachable', even from the host machine. Logging in via the serial console shows no problems: eth0 is up and the guest can ping the local host, but there is no outside connectivity. Restarting the network (/etc/init.d/networking restart) does nothing. Rebooting the machine brings it back to life.
I've verified there are no ARP games going on on the primary host (the
ARP tables remain the same before, when it had connectivity, and
after, when it doesn't).
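The before/after ARP comparison described here can be automated. The following is a rough sketch that assumes the column layout of the `arp` output shown earlier in this report (Address, HWtype, HWaddress, Flags, Mask, Iface, with Mask usually empty); `parse_arp` and `arp_changes` are hypothetical helper names.

```python
def parse_arp(output):
    """Map address -> (hwaddress, iface) from `arp` output (assumed layout)."""
    table = {}
    for line in output.splitlines():
        fields = line.split()
        # Skip the header and incomplete entries that lack a hardware address.
        if len(fields) < 5 or fields[0] == "Address":
            continue
        table[fields[0]] = (fields[2].lower(), fields[-1])
    return table


def arp_changes(before, after):
    """Return entries that were added, removed, or changed between snapshots."""
    changes = {}
    for host in sorted(set(before) | set(after)):
        if before.get(host) != after.get(host):
            changes[host] = (before.get(host), after.get(host))
    return changes


if __name__ == "__main__":
    snapshot = (
        "Address HWtype HWaddress Flags Mask Iface\n"
        "mailer01.xxxx.com ether 52:54:00:d2:f7:31 C br0\n"
        "dxi-gw2.xxxx.com ether 00:1a:30:2a:b1:c0 C br0\n"
    )
    before = parse_arp(snapshot)
    after = parse_arp(snapshot)
    print(arp_changes(before, after))
```

An empty result for snapshots taken while the guest was reachable and after it dropped off would match the observation that the ARP tables do not change.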
This is a critical issue affecting production services on the latest
LTS release of Ubuntu. It's similar to an issue which was 'resolved'
in 10.04 but appears to have reared its ugly head again.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/997978/+subscriptions
--
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to qemu-kvm in Ubuntu.