[Bug 1933092] Re: snat arp entry missing in qrouter namespace
OpenStack Infra
1933092 at bugs.launchpad.net
Thu Sep 23 19:54:15 UTC 2021
Reviewed: https://review.opendev.org/c/openstack/neutron/+/807244
Committed: https://opendev.org/openstack/neutron/commit/c689ac5a661a15ec7ad1098d21dc5affce97fb04
Submitter: "Zuul (22348)"
Branch: stable/train
commit c689ac5a661a15ec7ad1098d21dc5affce97fb04
Author: Hemanth Nakkina <hemanth.nakkina at canonical.com>
Date: Fri Jul 2 17:01:55 2021 +0530
Update arp entry of snat port on qrouter ns
In some cases, the arp entry of the snat port is not updated
in the qrouter namespace. The l3-agent calls get_ports_by_subnet()
while setting arps for the subnet, and the snat port is
not returned if it is still unbound. One scenario in which
this is observed is when a router is created, its external
gateway is set and an internal subnet is attached to it in
quick succession.
This patch retrieves snat port details from router info
as well and updates arp entry for snat port.
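
The idea described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from the patch: the function name ports_needing_arp and the sample data are hypothetical, and the port dictionaries are assumed to carry "mac_address" and "fixed_ips" keys. It merges the ports returned for the subnet with the snat ports known from the router info, so that an unbound snat port still yields an arp entry.

```python
def ports_needing_arp(subnet_id, ports_from_server, snat_ports):
    """Return {ip_address: mac_address} for ports on subnet_id.

    ports_from_server stands in for what get_ports_by_subnet() returned;
    snat_ports stands in for the snat ports taken from the router info.
    Both names are hypothetical and used for illustration only.
    """
    entries = {}
    for port in list(ports_from_server) + list(snat_ports):
        for fixed_ip in port.get("fixed_ips", []):
            if fixed_ip.get("subnet_id") == subnet_id:
                # A port present in both lists simply overwrites itself.
                entries[fixed_ip["ip_address"]] = port["mac_address"]
    return entries


# Made-up sample data: the snat port is missing from the server's answer
# (still unbound) but is present in the router info.
server_ports = [
    {"mac_address": "fa:16:3e:00:00:01",
     "fixed_ips": [{"subnet_id": "subnet-a", "ip_address": "192.168.33.10"}]},
]
router_snat_ports = [
    {"mac_address": "fa:16:3e:25:6a:73",
     "fixed_ips": [{"subnet_id": "subnet-a", "ip_address": "192.168.33.238"}]},
]
arp_entries = ports_needing_arp("subnet-a", server_ports, router_snat_ports)
```

Because the result is keyed by IP address, a snat port that appears in both sources produces a single entry, which matches the note in the bug description that adding the entry twice should be harmless.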
Conflicts:
neutron/agent/l3/dvr_local_router.py
Closes-Bug: #1933092
Change-Id: I7ee797b4b930306cf6360922d855f8b24f1b813d
(cherry picked from commit be7d0bb6abc893e53dfc864c52506928b1d38fa3)
(cherry picked from commit f1a9f4ed62fd2567cac174f80f87de53148ea7b9)
** Tags added: in-stable-train
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1933092
Title:
snat arp entry missing in qrouter namespace
Status in Ubuntu Cloud Archive:
New
Status in Ubuntu Cloud Archive ussuri series:
Fix Released
Status in Ubuntu Cloud Archive victoria series:
New
Status in Ubuntu Cloud Archive wallaby series:
New
Status in Ubuntu Cloud Archive xena series:
New
Status in neutron:
Fix Released
Status in neutron package in Ubuntu:
Fix Released
Status in neutron source package in Focal:
Fix Released
Status in neutron source package in Groovy:
New
Status in neutron source package in Hirsute:
New
Status in neutron source package in Impish:
Fix Released
Bug description:
[Impact]
Load Balancers deployed on the cloud are unreachable
[Test Case]
1. Deploy OpenStack with at least 4 compute nodes and the networking features DVR, SNAT and L3HA enabled
2. Execute the script test_snat_arp_entry.sh
3. The script loops 20 times, each time creating a network and a router, connecting the router to the external and internal networks, and checking whether ARP entries are populated properly in the qrouter namespaces
4. The script stops if arp entries are missing.
5. If the script completes all 20 loops, there are no issues.
[Regression Potential]
The issue happens only occasionally, when a router is created, its external gateway is set and an internal subnet is attached to the router in quick succession. In other cases, the arp entry of the snat port is already added.
The fix only adds extra logic that adds the arp entry using snat information retrieved from the router. In working cases, this extra logic will execute the commands to add the arp entry twice, which should not cause further issues.
[Original Bug Report]
In one cloud environment, the FIP attached to the Octavia load balancer VIP is not reachable. After analysis, we found that the ARP entry for the SNAT IP is missing in the qrouter namespace on the node where the Amphora VM is running. As a result, the return packets are not forwarded from the qrouter to the snat namespace on the active l3-agent node.
Version:
Ubuntu Ussuri packages (16.3.2 point release)
DVR+SNAT+L3HA enabled
The expectation is to have a PERMANENT arp entry for the snat ip in the qrouter namespace on all compute nodes:
192.168.33.238 dev qr-4ee692e0-7a lladdr fa:16:3e:25:6a:73 used 38/38/38 probes 0 PERMANENT
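
A check for this expectation could look roughly like the sketch below. It assumes the neighbour table of the qrouter namespace has already been captured as text in the format of the line above; how that text is obtained is left out. The function name has_permanent_entry is hypothetical and not taken from the attached script.

```python
def has_permanent_entry(neigh_output, ip_address):
    """Return True if ip_address has a PERMANENT entry in neigh_output.

    neigh_output is assumed to be neighbour-table text with one entry per
    line, the IP address as the first field and the state as the last.
    """
    for line in neigh_output.splitlines():
        fields = line.split()
        if fields and fields[0] == ip_address and fields[-1] == "PERMANENT":
            return True
    return False


# The sample entry quoted in this bug report.
sample = ("192.168.33.238 dev qr-4ee692e0-7a lladdr fa:16:3e:25:6a:73 "
          "used 38/38/38 probes 0 PERMANENT")
found = has_permanent_entry(sample, "192.168.33.238")
missing = has_permanent_entry(sample, "192.168.33.1")
```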
How to reproduce:
I am attaching a script that simulates the problem (without Octavia) with the following steps:
1. A network, subnet and router are created, and the network is attached to the router
2. Verify that the qrouter on all compute nodes has arp entries for the snat ip
3. If the arp entries exist, delete the network, subnet and router
4. Repeat steps 1,2,3 until missing arp entry is observed.
I am able to reproduce the missing arp entry, sometimes in the 3rd
loop and sometimes in the 6th loop.
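
The loop in the steps above can be sketched as follows. This is not the attached script: the three callables are hypothetical placeholders for the create, verify and delete steps, and the default limit of 20 loops is borrowed from the test case section.

```python
def run_until_missing(create, arp_present, delete, max_loops=20):
    """Repeat create/verify/delete until the arp entry is missing.

    Returns the 1-based loop number in which the entry was missing,
    or None if every loop passed. All three callables are hypothetical
    stand-ins for the real OpenStack operations.
    """
    for loop in range(1, max_loops + 1):
        resources = create()
        if not arp_present(resources):
            # Stop here and keep the resources so they can be inspected.
            return loop
        delete(resources)
    return None


# Stubbed usage: pretend the entry goes missing in the 3rd loop.
counter = {"n": 0}


def fake_create():
    counter["n"] += 1
    return counter["n"]


failed_at = run_until_missing(fake_create, lambda r: r != 3, lambda r: None)
```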
I observed that arp entries for the snat ip are updated at the following
places [1] [2], but get_snat_interfaces() and get_ports_by_subnet() do not
include the snat ip in non-working cases.
[1] https://opendev.org/openstack/neutron/src/commit/dfd04115b059c2263cdd8ac44ccc2ec47614bcc3/neutron/agent/l3/dvr_local_router.py#L570
[2] https://opendev.org/openstack/neutron/src/commit/dfd04115b059c2263cdd8ac44ccc2ec47614bcc3/neutron/agent/l3/dvr_local_router.py#L317
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1933092/+subscriptions