[Bug 1825147] Re: ovs flooding packets, not learning MAC addresses
Junien Fridrick
1825147 at bugs.launchpad.net
Wed Apr 17 12:10:32 UTC 2019
An additional data point: MAC learning appears to be working fine for
subnets not attached to a router. As soon as I attach the subnet to a
router, the bad behaviour starts.
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to neutron in Ubuntu.
https://bugs.launchpad.net/bugs/1825147
Title:
ovs flooding packets, not learning MAC addresses
Status in neutron:
New
Status in neutron package in Ubuntu:
New
Bug description:
Hi,
I am using OpenStack Rocky on Ubuntu 18.04, with dvr_snat and L3HA, and
the openvswitch firewall driver. The openvswitch version is
2.10.0-0ubuntu2~cloud0. Deployed with Juju.
I was doing load testing by creating a bunch of instances, and noticed
that the network throughput available to instances dropped
dramatically as I was creating VMs. In other words, with 2 VMs on my
cloud, I had pretty good bandwidth, but with 100 (idle) VMs, bandwidth
became ridiculously low.
Investigating the problem, I noticed that ovs was flooding traffic:
all instances on a hypervisor were getting all the traffic destined
to any VM on another hypervisor.
In other words, I had vmA1 and vmA2 on hypervisor A, and vmB1 on
hypervisor B, then TCP traffic between vmA1 and vmB1 could be seen on
vmA2.
Digging more into this, I think I located the problem in the ovs MAC
learning process, more specifically on br-int (using "sudo ovs-appctl
fdb/show br-int").
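A minimal sketch of how the table can be checked for a given MAC, assuming the output of fdb/show has four whitespace-separated columns (port, VLAN, MAC, age). The sample text, port numbers and MAC addresses below are made up for illustration and are not taken from the affected cloud.

```python
def parse_fdb(text):
    """Parse 'ovs-appctl fdb/show <bridge>' style output.

    Returns a dict mapping (vlan, mac) -> port. Assumes four columns per
    row: port, VLAN, MAC, age. The port is kept as a string.
    """
    table = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row (its VLAN column is not a number).
        if len(fields) != 4 or not fields[1].isdigit():
            continue
        port, vlan, mac, _age = fields
        table[(int(vlan), mac.lower())] = port
    return table


# Hypothetical sample output, for illustration only.
SAMPLE = """\
 port  VLAN  MAC                Age
    5     3  fa:16:3e:00:00:01    2
    2     3  fa:16:3e:00:00:02   10
"""

fdb = parse_fdb(SAMPLE)
# A remote VM's MAC that is missing here would be flooded by NORMAL.
print((3, "fa:16:3e:00:00:03") in fdb)
```

In practice the text would come from running the command on the hypervisor; a remote VM's MAC that never appears on the patch-tun port is the symptom described in this report.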
Traffic flow from vmA1 to vmB1, on hypervisor A, looks like: tap (on
br-int), patch-tun (on br-int), patch-int (on br-tun), vxlan to
hypervisor B.
So whenever traffic comes back (the other way around), the MAC address
of vmB1 should be learned, on br-int, on the patch-tun port - and that
is not the case. So whenever vmA1 sends traffic to vmB1, at some point
it reaches the "NORMAL" action, and since the destination MAC is not
learned, traffic is getting flooded: see ofproto/trace
https://pastebin.ubuntu.com/p/mbrrj4wPxY/ (see "no learned MAC for
destination, flooding")
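The difference between the two cases can be illustrated with a toy model of a generic L2 learning switch. This is only a sketch of the general technique, not how ovs implements the NORMAL action: there are no VLANs and no ageing, the class and the learn flag are hypothetical, and the tap port names and MAC addresses are invented.

```python
class LearningSwitch:
    """Toy L2 learning switch: one FDB, no VLANs, no ageing."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.fdb = {}  # source MAC -> port it was last seen on

    def receive(self, in_port, src, dst, learn=True):
        """Return the sorted list of ports the frame is sent out of."""
        if learn:
            self.fdb[src] = in_port
        out_port = self.fdb.get(dst)
        if out_port is None:
            # Unknown destination: flood to every port except the ingress one.
            return sorted(self.ports - {in_port})
        return [out_port]


# Hypothetical MACs and tap names; patch-tun is the port named above.
VM_A1, VM_B1 = "fa:16:3e:00:00:a1", "fa:16:3e:00:00:b1"
ports = ["tap-vmA1", "tap-vmA2", "patch-tun"]

# Case 1: return traffic from vmB1 arrives on patch-tun but is not learned.
broken = LearningSwitch(ports)
broken.receive("patch-tun", VM_B1, VM_A1, learn=False)
flooded = broken.receive("tap-vmA1", VM_A1, VM_B1)

# Case 2: return traffic is learned, so later frames go to one port only.
healthy = LearningSwitch(ports)
healthy.receive("patch-tun", VM_B1, VM_A1)
forwarded = healthy.receive("tap-vmA1", VM_A1, VM_B1)

print(flooded)    # ['patch-tun', 'tap-vmA2']
print(forwarded)  # ['patch-tun']
```

In the first case the frame from vmA1 also reaches the tap port of vmA2, which matches what was observed on hypervisor A.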
Digging more into this, it would appear that ovs learns a MAC address
only from broadcast ARP requests, and not from ARP requests with a
unicast MAC address (which is what Linux uses after a successful
broadcast ARP request): https://pastebin.ubuntu.com/p/Sfq775cX6V/.
Once the MAC is learned, there's no more flooding:
https://pastebin.ubuntu.com/p/bBNHrRKndg/ (see "forwarding to learned
port" instead of "no learned MAC for destination, flooding").
Flooding has security consequences (VMs can see traffic not destined
to them - although only traffic for VMs in the same neutron network),
and performance consequences, so it should be avoided.
Thanks
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1825147/+subscriptions