ACK: [SRU][F:linux-bluefield][PATCH] cls_flower: Fix inability to match GRE/IPIP packets
Tim Gardner
tim.gardner at canonical.com
Fri May 20 12:12:29 UTC 2022
Acked-by: Tim Gardner <tim.gardner at canonical.com>
On 5/18/22 16:19, Bodong Wang wrote:
> From: Yoshiki Komachi <komachi.yoshiki at gmail.com>
>
> BugLink: https://bugs.launchpad.net/bugs/1974096
>
> When a packet of a new flow arrives in openvswitch kernel module, it dissects
> the packet and passes the extracted flow key to ovs-vswtichd daemon. If hw-
> offload configuration is enabled, the daemon creates a new TC flower entry to
> bypass openvswitch kernel module for the flow (TC flower can also offload flows
> to NICs but this time that does not matter).
>
> In this processing flow, I found the following issue in cases of GRE/IPIP
> packets.
>
> When ovs_flow_key_extract() in openvswitch module parses a packet of a new
> GRE (or IPIP) flow received on non-tunneling vports, it extracts information
> of the outer IP header for ip_proto/src_ip/dst_ip match keys.
>
> This means ovs-vswitchd creates a TC flower entry with IP protocol/addresses
> match keys whose values are those of the outer IP header. OTOH, TC flower,
> which uses flow_dissector (different parser from openvswitch module), extracts
> information of the inner IP header.
>
> The following flow is an example to describe the issue in more detail.
>
> <----------- Outer IP -----------------> <---------- Inner IP ---------->
> +----------+--------------+--------------+----------+----------+----------+
> | ip_proto | src_ip | dst_ip | ip_proto | src_ip | dst_ip |
> | 47 (GRE) | 192.168.10.1 | 192.168.10.2 | 6 (TCP) | 10.0.0.1 | 10.0.0.2 |
> +----------+--------------+--------------+----------+----------+----------+
>
> In this case, TC flower entry and extracted information are shown as below:
>
> - ovs-vswitchd creates TC flower entry with:
> - ip_proto: 47
> - src_ip: 192.168.10.1
> - dst_ip: 192.168.10.2
>
> - TC flower extracts below for IP header matches:
> - ip_proto: 6
> - src_ip: 10.0.0.1
> - dst_ip: 10.0.0.2
>
> Thus, GRE or IPIP packets never match the TC flower entry, as each
> dissector behaves differently.
>
> IMHO, the behavior of TC flower (flow dissector) does not look correct,
> as ip_proto/src_ip/dst_ip in TC flower match means the outermost IP
> header information except for GRE/IPIP cases. This patch adds a new
> flow_dissector flag FLOW_DISSECTOR_F_STOP_BEFORE_ENCAP which skips
> dissection of the encapsulated inner GRE/IPIP header in TC flower
> classifier.
>
> Signed-off-by: Yoshiki Komachi <komachi.yoshiki at gmail.com>
> Signed-off-by: David S. Miller <davem at davemloft.net>
> (backported from commit 6de6e46d27ef386feecdbea56b3bfd6c3b3bc1f9)
> Signed-off-by: Bodong Wang <bodong at nvidia.com>
> [bodong: resolve the conflict due to missing of skb_flow_dissect_hash]
> ---
> include/net/flow_dissector.h | 1 +
> net/core/flow_dissector.c | 15 +++++++++++++++
> net/sched/cls_flower.c | 3 ++-
> 3 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
> index 4d7df77..d9f50b6 100644
> --- a/include/net/flow_dissector.h
> +++ b/include/net/flow_dissector.h
> @@ -261,6 +261,7 @@ enum flow_dissector_key_id {
> #define FLOW_DISSECTOR_F_PARSE_1ST_FRAG BIT(0)
> #define FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL BIT(1)
> #define FLOW_DISSECTOR_F_STOP_AT_ENCAP BIT(2)
> +#define FLOW_DISSECTOR_F_STOP_BEFORE_ENCAP BIT(3)
>
> struct flow_dissector_key {
> enum flow_dissector_key_id key_id;
> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> index e6555e6..9001ddc 100644
> --- a/net/core/flow_dissector.c
> +++ b/net/core/flow_dissector.c
> @@ -1264,6 +1264,11 @@ bool __skb_flow_dissect(const struct net *net,
>
> switch (ip_proto) {
> case IPPROTO_GRE:
> + if (flags & FLOW_DISSECTOR_F_STOP_BEFORE_ENCAP) {
> + fdret = FLOW_DISSECT_RET_OUT_GOOD;
> + break;
> + }
> +
> fdret = __skb_flow_dissect_gre(skb, key_control, flow_dissector,
> target_container, data,
> &proto, &nhoff, &hlen, flags);
> @@ -1321,6 +1326,11 @@ bool __skb_flow_dissect(const struct net *net,
> break;
> }
> case IPPROTO_IPIP:
> + if (flags & FLOW_DISSECTOR_F_STOP_BEFORE_ENCAP) {
> + fdret = FLOW_DISSECT_RET_OUT_GOOD;
> + break;
> + }
> +
> proto = htons(ETH_P_IP);
>
> key_control->flags |= FLOW_DIS_ENCAPSULATION;
> @@ -1333,6 +1343,11 @@ bool __skb_flow_dissect(const struct net *net,
> break;
>
> case IPPROTO_IPV6:
> + if (flags & FLOW_DISSECTOR_F_STOP_BEFORE_ENCAP) {
> + fdret = FLOW_DISSECT_RET_OUT_GOOD;
> + break;
> + }
> +
> proto = htons(ETH_P_IPV6);
>
> key_control->flags |= FLOW_DIS_ENCAPSULATION;
> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
> index 5129b67..2ef5885 100644
> --- a/net/sched/cls_flower.c
> +++ b/net/sched/cls_flower.c
> @@ -326,7 +326,8 @@ static int fl_classify(struct sk_buff *skb, const struct tcf_proto *tp,
> fl_ct_info_to_flower_map,
> ARRAY_SIZE(fl_ct_info_to_flower_map),
> post_ct, zone);
> - skb_flow_dissect(skb, &mask->dissector, &skb_key, 0);
> + skb_flow_dissect(skb, &mask->dissector, &skb_key,
> + FLOW_DISSECTOR_F_STOP_BEFORE_ENCAP);
>
> fl_set_masked_key(&skb_mkey, &skb_key, mask);
>
--
-----------
Tim Gardner
Canonical, Inc
More information about the kernel-team
mailing list