[Bug 1976285] Re: SNAT/DNAT - Traffic sent to LRP port recirculate until TTL=0 (drop recirc action)
Bug Watch Updater
1976285 at bugs.launchpad.net
Mon May 30 15:57:08 UTC 2022
** Changed in: openvswitch
Status: Unknown => New
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ovn in Ubuntu.
https://bugs.launchpad.net/bugs/1976285
Title:
SNAT/DNAT - Traffic sent to LRP port recirculate until TTL=0 (drop
recirc action)
Status in openvswitch:
New
Status in ovn package in Ubuntu:
New
Bug description:
Hey there,
I've been reading the docs quite extensively for references on how
SNAT and DNAT flows work, trying to understand the problem behind the
issues reported in the links below:
https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967718
https://mail.openvswitch.org/pipermail/ovs-dev/2021-August/386720.html
I can see these same log messages "kernel: openvswitch: ovs-system:
deferred action limit reached, drop recirc action" on gateway nodes in
my OpenStack installation.
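To gauge how often this happens on a gateway node, the kernel log can be scanned for the message. The following is only a minimal sketch: the message text is the one quoted above, while the function name, the sample lines, the timestamps, and the host name "gw1" are made up for illustration and the real log line layout may differ.

```python
import re

# Message text as quoted in this report; everything around it in a real
# log line (timestamp, host name) is an assumption.
PATTERN = re.compile(
    r"openvswitch: (\S+): deferred action limit reached, drop recirc action"
)


def count_recirc_drops(log_text):
    """Return a dict mapping datapath name to the number of matching lines."""
    counts = {}
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            name = match.group(1)
            counts[name] = counts.get(name, 0) + 1
    return counts


# Hypothetical sample input, not taken from a real system.
SAMPLE = """\
May 30 15:00:01 gw1 kernel: openvswitch: ovs-system: deferred action limit reached, drop recirc action
May 30 15:00:01 gw1 kernel: some unrelated message
May 30 15:00:02 gw1 kernel: openvswitch: ovs-system: deferred action limit reached, drop recirc action
"""

if __name__ == "__main__":
    print(count_recirc_drops(SAMPLE))
```

Note that the kernel may rate-limit this warning, so the count is probably a lower bound on the number of dropped recirc actions rather than an exact figure.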
The main problem is that TCP/UDP traffic sent to the address of an
LRP port, and not part of any SNAT/DNAT conversation, keeps
recirculating in the OVS data plane until its TTL reaches 0.
The message appears in the kernel log because the FIFO bounded by
"DEFERRED_ACTION_FIFO_SIZE" fills up, but this is a consequence of the
packets not matching the flow tables of the datapath. See
net/openvswitch/actions.c in the kernel source.
I can reproduce this on a local OVN/OVS installation built from the
OVN main/master branch and the OVS submodule (GitHub projects). The
problem also occurs with all the latest released tags of OVN and OVS
for Ubuntu 20.04 LTS.
Basically, it only happens when there is a SNAT rule to translate an
entire network (masquerade) and the return traffic does not have an
open port. If a DNAT is used for a specific host (even if the ports
have not been mapped, but if there is a 'host' to redirect the DNAT),
the traffic is forwarded and is not sent via netlink through the
slowpath until it is dropped.
The patch proposed by Krzysztof Klimonda aims to modify the flow table
via OVN, inserting a drop rule for the traffic related to this issue.
The patch was not accepted by the project, but it left me wondering
how to solve this problem (I can't just increase the kernel
DEFERRED_ACTION_FIFO_SIZE). The proposed patch is very old and does
not apply to the current code structure. I tried to adapt it from
ovn-northd.c to the new northd/northd.c layout and applied it to
upstream, but the problem still occurs.
ovn_upstream.txt[https://github.com/openvswitch/ovs-issues/files/8798982/ovn_upstream.txt]
I believe the patch does not solve the problem because I keep seeing
messages in the log.
Do you have any ideas on how to solve this problem?
I am adding a reproducer for this issue in the attached file.
issue_reproducer.txt[https://github.com/openvswitch/ovs-issues/files/8798161/issue_reproducer.txt]
Kind regards,
Roberto
To manage notifications about this bug go to:
https://bugs.launchpad.net/openvswitch/+bug/1976285/+subscriptions