[Bug 1976285] Re: SNAT/DNAT - Traffic sent to LRP port recirculate until TTL=0 (drop recirc action)

Bug Watch Updater 1976285 at bugs.launchpad.net
Mon May 30 15:57:08 UTC 2022


** Changed in: openvswitch
       Status: Unknown => New

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ovn in Ubuntu.
https://bugs.launchpad.net/bugs/1976285

Title:
  SNAT/DNAT - Traffic sent to LRP port recirculate until TTL=0 (drop
  recirc action)

Status in openvswitch:
  New
Status in ovn package in Ubuntu:
  New

Bug description:
  Hey there,

  I have been searching the docs extensively for references on how SNAT
  and DNAT flows work, to try to understand the problem behind the
  issues reported in the links below:

  https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967718
  https://mail.openvswitch.org/pipermail/ovs-dev/2021-August/386720.html

  I can see these same log messages "kernel: openvswitch: ovs-system:
  deferred action limit reached, drop recirc action" on gateway nodes in
  my OpenStack installation.
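
  To gauge how often this happens on a gateway node, the kernel log can
  be scanned for the message quoted above. The following is a minimal
  sketch, not part of the original report: the function name
  count_recirc_drops is hypothetical, and the pattern is built only from
  the message text shown above.

```python
import re
from collections import Counter

# Pattern derived from the log message quoted in the report. The datapath
# name ("ovs-system" there) is captured so that other datapaths are
# counted separately.
DROP_RE = re.compile(
    r"openvswitch: (?P<dp>\S+): deferred action limit reached, "
    r"drop recirc action"
)


def count_recirc_drops(lines):
    """Return a Counter mapping datapath name -> number of drop messages."""
    counts = Counter()
    for line in lines:
        match = DROP_RE.search(line)
        if match:
            counts[match.group("dp")] += 1
    return counts
```

  It accepts any iterable of log lines, for example an open file holding
  saved kernel log output.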

  The main problem is that TCP/UDP traffic sent to the address of an LRP
  port, when it is not part of any SNAT/DNAT conversation, keeps
  recirculating in the OVS data plane until its TTL reaches 0.

  The message appears in the kernel log because the deferred-action
  FIFO (sized by "DEFERRED_ACTION_FIFO_SIZE") fills up, but that is only
  a consequence of the packets not matching the flow tables of the
  datapath. See net/openvswitch/actions.c in the kernel source.
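
  The relationship between the looping packet and the FIFO limit can be
  illustrated with a toy model. This is only a sketch under stated
  assumptions and not the kernel's logic: it assumes each pass through
  the router decrements the TTL and requests one deferred recirculation,
  that slots are not reused within a single processing run, and that the
  FIFO holds 10 entries (check DEFERRED_ACTION_FIFO_SIZE in your kernel
  source). The function name simulate_recirc_chain is hypothetical.

```python
# Assumed value; verify DEFERRED_ACTION_FIFO_SIZE in your kernel's
# net/openvswitch/actions.c before relying on this number.
FIFO_SIZE = 10


def simulate_recirc_chain(ttl, fifo_size=FIFO_SIZE):
    """Toy model of a packet that keeps being recirculated.

    Each pass uses one FIFO slot and decrements the TTL. The run ends
    when the TTL reaches 0 or when no slot is left, whichever is first.

    Returns a tuple (reason, passes, remaining_ttl).
    """
    used_slots = 0
    passes = 0
    while ttl > 0:
        if used_slots >= fifo_size:
            # No slot left: this is where the model would log the drop.
            return ("deferred action limit reached", passes, ttl)
        used_slots += 1
        passes += 1
        ttl -= 1
    return ("ttl expired", passes, ttl)
```

  In this model a packet with a typical initial TTL of 64 hits the FIFO
  limit long before the TTL expires, which fits the view that the log
  message is a symptom of the loop rather than its cause.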

  I can reproduce this on a local OVN/OVS installation built from the
  OVN main/master branch and its OVS submodule (GitHub projects). The
  problem also occurs with all the latest released OVN and OVS tags for
  Ubuntu 20.04 LTS.

  Basically, it only happens when there is a SNAT rule to translate an
  entire network (masquerade) and the return traffic does not have an
  open port. If a DNAT is used for a specific host (even if the ports
  have not been mapped, but if there is a 'host' to redirect the DNAT),
  the traffic is forwarded and is not sent via netlink through the
  slowpath until it is dropped.

  The patch proposed by Krzysztof Klimonda aims to modify the flow table
  through OVN by inserting a drop rule for the traffic related to this
  issue. The patch was not accepted in the project, but it made me
  curious about how to solve this problem (I can't just increase the
  kernel DEFERRED_ACTION_FIFO_SIZE). The proposed patch is very old and
  does not apply to the current code structure. I tried to adapt it from
  ovn-northd.c to the new northd/northd.c layout and applied it to
  upstream, but the problem still occurs.
  ovn_upstream.txt[https://github.com/openvswitch/ovs-issues/files/8798982/ovn_upstream.txt]

  I believe the patch does not solve the problem because I keep seeing
  messages in the log.

  Do you have any ideas on how to solve this problem?

  I am adding a reproducer for this issue in the attached file.
  issue_reproducer.txt[https://github.com/openvswitch/ovs-issues/files/8798161/issue_reproducer.txt]

  
  Kind regards,
  Roberto

To manage notifications about this bug go to:
https://bugs.launchpad.net/openvswitch/+bug/1976285/+subscriptions



