[Bug 1894843] Re: [dvr_snat] Router update deletes rfp interface from qrouter even when VM port is present on this host

Hemanth Nakkina 1894843 at bugs.launchpad.net
Tue Mar 16 10:35:36 UTC 2021


Verified on groovy-proposed; the fix works as per the test case.

Ping to the floating IP failed while the router was disabled and succeeded a few
milliseconds after the router was enabled.

$ ping 10.5.150.53
PING 10.5.150.53 (10.5.150.53) 56(84) bytes of data.
64 bytes from 10.5.150.53: icmp_seq=1 ttl=62 time=0.788 ms
64 bytes from 10.5.150.53: icmp_seq=2 ttl=62 time=0.809 ms
64 bytes from 10.5.150.53: icmp_seq=3 ttl=62 time=1.07 ms
64 bytes from 10.5.150.53: icmp_seq=4 ttl=62 time=0.740 ms
64 bytes from 10.5.150.53: icmp_seq=5 ttl=62 time=0.919 ms
64 bytes from 10.5.150.53: icmp_seq=6 ttl=62 time=0.893 ms
64 bytes from 10.5.150.53: icmp_seq=7 ttl=62 time=0.901 ms
64 bytes from 10.5.150.53: icmp_seq=8 ttl=62 time=0.838 ms
From 10.5.153.226 icmp_seq=11 Redirect Host(New nexthop: 53.150.5.10)
From 10.5.153.226 icmp_seq=12 Redirect Host(New nexthop: 53.150.5.10)
From 10.5.153.226 icmp_seq=13 Redirect Host(New nexthop: 53.150.5.10)
From 10.5.153.226 icmp_seq=14 Redirect Host(New nexthop: 53.150.5.10)
From 10.5.153.226 icmp_seq=15 Redirect Host(New nexthop: 53.150.5.10)
From 10.5.153.226 icmp_seq=16 Redirect Host(New nexthop: 53.150.5.10)
From 10.5.153.226 icmp_seq=18 Redirect Host(New nexthop: 53.150.5.10)
From 10.5.153.226 icmp_seq=19 Destination Host Unreachable
From 10.5.153.226 icmp_seq=20 Destination Host Unreachable
From 10.5.0.5 icmp_seq=21 Destination Host Unreachable
From 10.5.0.5 icmp_seq=22 Destination Host Unreachable
From 10.5.0.5 icmp_seq=23 Destination Host Unreachable
From 10.5.0.5 icmp_seq=24 Destination Host Unreachable
From 10.5.0.5 icmp_seq=25 Destination Host Unreachable
From 10.5.0.5 icmp_seq=26 Destination Host Unreachable
From 10.5.0.5 icmp_seq=27 Destination Host Unreachable
From 10.5.0.5 icmp_seq=28 Destination Host Unreachable
From 10.5.0.5 icmp_seq=29 Destination Host Unreachable
From 10.5.0.5 icmp_seq=30 Destination Host Unreachable
From 10.5.0.5 icmp_seq=31 Destination Host Unreachable
From 10.5.0.5 icmp_seq=32 Destination Host Unreachable
From 10.5.0.5 icmp_seq=33 Destination Host Unreachable
64 bytes from 10.5.150.53: icmp_seq=34 ttl=62 time=2248 ms
64 bytes from 10.5.150.53: icmp_seq=36 ttl=62 time=200 ms
64 bytes from 10.5.150.53: icmp_seq=35 ttl=62 time=1224 ms
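
For anyone repeating the verification, the outage window can be read off a saved ping log instead of by eye. The following is only a sketch: the function name ping_outage is made up for illustration, and it assumes the usual Linux ping line format shown above. It prints the first and last icmp_seq between the lowest and highest sequence seen that did not get an echo reply (sequence numbers with no line at all count as lost too).

```shell
# Hypothetical helper: summarise lost echo requests from ping output on stdin.
ping_outage() {
    awk '
        match($0, /icmp_seq=[0-9]+/) {
            # "icmp_seq=" is 9 characters long; the digits follow it.
            seq = substr($0, RSTART + 9, RLENGTH - 9) + 0
            if ($0 ~ /bytes from/) ok[seq] = 1   # an echo reply arrived
            if (!seen || seq < min) min = seq
            if (seq > max) max = seq
            seen = 1
        }
        END {
            if (!seen) exit
            for (i = min; i <= max; i++) {
                if (!(i in ok)) {
                    if (!found) { first = i; found = 1 }
                    last = i
                }
            }
            if (found) print first, last
        }'
}
```

Usage would be something like `ping 10.5.150.53 | tee ping.log` during the test and `ping_outage < ping.log` afterwards. For the output pasted above it prints `9 33`.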

** Tags removed: verification-needed-groovy
** Tags added: verification-done-groovy

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1894843

Title:
  [dvr_snat] Router update deletes rfp interface from qrouter even when
  VM port is present on this host

Status in Ubuntu Cloud Archive:
  Fix Committed
Status in Ubuntu Cloud Archive ussuri series:
  Fix Committed
Status in Ubuntu Cloud Archive victoria series:
  Fix Committed
Status in neutron:
  Fix Released
Status in neutron package in Ubuntu:
  Fix Released
Status in neutron source package in Focal:
  Fix Committed
Status in neutron source package in Groovy:
  Fix Committed
Status in neutron source package in Hirsute:
  Fix Released

Bug description:
  [Impact]
  When neutron schedules snat namespaces it sometimes deletes the rfp interface from qrouter namespaces which breaks external network (fip) connectivity. The fix prevents this from happening.

  [Test Case]
   * deploy OpenStack (Ussuri or above) with dvr_snat enabled on compute hosts.
   * ensure min. 2 compute hosts
   * create one ext network and one private network
   * add private subnet to router and ext as gateway
   * check which compute has the snat ns (ip netns| grep snat)
   * create a vm on each compute host
   * check that qrouter ns on both computes has rfp interface
   * ip netns| grep qrouter; ip netns exec <ns> ip a s| grep rfp
   * disable and re-enable router
   * openstack router set --disable <router>;  openstack router set --enable <router>
   * check again
   * ip netns| grep qrouter; ip netns exec <ns> ip a s| grep rfp
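
The rfp check in the steps above can be wrapped in a small script so it is easy to run on each compute host before and after toggling the router. This is a rough sketch, not part of the fix: the function names has_rfp and check_qrouters are invented here, it must run as root, and it assumes the interface shows up in `ip a s` with a name starting with "rfp-".

```shell
# Hypothetical helper: reads `ip a s` output on stdin and succeeds if an
# interface whose name starts with "rfp-" is listed.
has_rfp() {
    grep -Eq '^[0-9]+: rfp-'
}

# Hypothetical helper: report rfp presence for every qrouter namespace
# on this host.
check_qrouters() {
    # `ip netns` may print "name (id: N)"; keep only the name.
    ip netns | awk '/^qrouter-/ {print $1}' | while read -r ns; do
        if ip netns exec "$ns" ip a s | has_rfp; then
            echo "$ns: rfp present"
        else
            echo "$ns: rfp MISSING"
        fi
    done
}
```

Calling `check_qrouters` on both computes before and after the disable/enable step gives a line per namespace to compare.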

  [Where problems could occur]
  This patch is in fact restoring expected behaviour and is not expected to
  introduce any new regressions.

  -------------------------------------------------------------------------

  Hello,

  When l3 agents in dvr_snat mode are deployed on hypervisors, there
  can be a race condition. The agent creates snat namespaces on each
  scheduled host and removes them in a second step. In this second step
  the agent removes the rfp interface from qrouter even when there is a
  VM with a floating IP on the host.

  If a VM is deployed during the second step, external access to the
  VM's floating IP can be lost. The issue can be reproduced by hand:

  1. Create a tenant network and a router with an external gateway
  2. Create a VM with a floating IP
  3. Ensure that the VM is on a hypervisor without a snat-* namespace
  4. Set the router to disabled state (openstack router set --disable <router>)
  5. Set the router to enabled state (openstack router set --enable <router>)
  6. External access to the VM's FIP is lost because the L3 agent creates the qrouter namespace without the rfp interface.
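
The disable/enable step and the reachability check that follows it could be scripted roughly like this. It is a sketch under assumptions: toggle_router and wait_for_fip are made-up names, the openstack client is assumed to be configured with admin credentials, and the ping flags are the Linux iputils ones.

```shell
# Hypothetical helper: disable and then re-enable the given router.
toggle_router() {
    router=$1
    openstack router set --disable "$router" &&
        openstack router set --enable "$router"
}

# Hypothetical helper: ping the floating IP one packet at a time until it
# answers or max_tries attempts have failed (default 60).
wait_for_fip() {
    fip=$1
    max_tries=${2:-60}
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        if ping -c 1 -W 1 "$fip" >/dev/null 2>&1; then
            echo "reachable after $tries failed attempts"
            return 0
        fi
        tries=$((tries + 1))
    done
    echo "still unreachable after $max_tries attempts"
    return 1
}
```

For example, `toggle_router <router> && wait_for_fip <fip> 120`. On an affected host the second command would be expected to give up with a non-zero exit status, while with the fix it should report the FIP reachable again.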

  Environment:

  1. Neutron with ML2 OVS plugin.
  2. L3 agents in dvr_snat mode on each hypervisor
  3. openstack-neutron-common-15.1.1-0.20200611111910.7d97420.el8ost.noarch

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1894843/+subscriptions


