[Bug 1966981] Re: [hwol] centralized gateway NAT mangles source address

Frode Nordahl 1966981 at bugs.launchpad.net
Thu Mar 31 05:52:39 UTC 2022


** Description changed:

  When hardware offload is enabled, CMS configures gateway routers and
  centralized NAT, the source address of packets flowing through the
  gateway are mangled after the connection preamble, not allowing the
- connection to form.
+ connection to form:
+ 
+ tcpdump: listening on tape5c1862d-b4, link-type EN10MB (Ethernet), capture size 262144 bytes
+ 05:52:04.286472 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 62, id 8865, offset 0, flags [DF], proto TCP (6), length 60)
+     10.11.2.11.57704 > 10.42.3.34.22: Flags [S], cksum 0x1e2a (correct), seq 3991821787, win 64240, options [mss 1460,sackOK,TS val 943370240 ecr 0,nop,wscale 7], length 0
+ 05:52:04.286661 fa:16:3e:fc:82:be > fa:16:3e:c8:19:af, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
+     10.42.3.34.22 > 10.11.2.11.57704: Flags [S.], cksum 0x1990 (incorrect -> 0xdac4), seq 2903091535, ack 3991821788, win 62230, options [mss 8902,sackOK,TS val 2422032229 ecr 943370240,nop,wscale 7], length 0
+ 05:52:04.287314 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 66: (tos 0x0, ttl 62, id 8866, offset 0, flags [DF], proto TCP (6), length 52)
+     10.11.2.11.57704 > 10.42.3.34.22: Flags [.], cksum 0x17c3 (correct), seq 1, ack 1, win 502, options [nop,nop,TS val 943370241 ecr 2422032229], length 0
+ 05:52:04.287589 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 107: (tos 0x0, ttl 62, id 8867, offset 0, flags [DF], proto TCP (6), length 93)
+     0.0.0.0.57704 > 10.42.3.34.22: Flags [P.], cksum 0x017f (correct), seq 3991821788:3991821829, ack 2903091536, win 502, options [nop,nop,TS val 943370242 ecr 2422032229], length 41
+ 
  
  When disabling hardware offload completely the problem goes away.
  
  Running ovn-northd without [0] also makes the problem go away.
  
  0: https://github.com/ovn-
  org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a

** Description changed:

  When hardware offload is enabled, CMS configures gateway routers and
  centralized NAT, the source address of packets flowing through the
  gateway are mangled after the connection preamble, not allowing the
  connection to form:
  
  tcpdump: listening on tape5c1862d-b4, link-type EN10MB (Ethernet), capture size 262144 bytes
  05:52:04.286472 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 62, id 8865, offset 0, flags [DF], proto TCP (6), length 60)
-     10.11.2.11.57704 > 10.42.3.34.22: Flags [S], cksum 0x1e2a (correct), seq 3991821787, win 64240, options [mss 1460,sackOK,TS val 943370240 ecr 0,nop,wscale 7], length 0
+     10.11.2.11.57704 > 10.42.3.34.22: Flags [S], cksum 0x1e2a (correct), seq 3991821787, win 64240, options [mss 1460,sackOK,TS val 943370240 ecr 0,nop,wscale 7], length 0
  05:52:04.286661 fa:16:3e:fc:82:be > fa:16:3e:c8:19:af, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
-     10.42.3.34.22 > 10.11.2.11.57704: Flags [S.], cksum 0x1990 (incorrect -> 0xdac4), seq 2903091535, ack 3991821788, win 62230, options [mss 8902,sackOK,TS val 2422032229 ecr 943370240,nop,wscale 7], length 0
+     10.42.3.34.22 > 10.11.2.11.57704: Flags [S.], cksum 0x1990 (incorrect -> 0xdac4), seq 2903091535, ack 3991821788, win 62230, options [mss 8902,sackOK,TS val 2422032229 ecr 943370240,nop,wscale 7], length 0
  05:52:04.287314 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 66: (tos 0x0, ttl 62, id 8866, offset 0, flags [DF], proto TCP (6), length 52)
-     10.11.2.11.57704 > 10.42.3.34.22: Flags [.], cksum 0x17c3 (correct), seq 1, ack 1, win 502, options [nop,nop,TS val 943370241 ecr 2422032229], length 0
+     10.11.2.11.57704 > 10.42.3.34.22: Flags [.], cksum 0x17c3 (correct), seq 1, ack 1, win 502, options [nop,nop,TS val 943370241 ecr 2422032229], length 0
  05:52:04.287589 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 107: (tos 0x0, ttl 62, id 8867, offset 0, flags [DF], proto TCP (6), length 93)
-     0.0.0.0.57704 > 10.42.3.34.22: Flags [P.], cksum 0x017f (correct), seq 3991821788:3991821829, ack 2903091536, win 502, options [nop,nop,TS val 943370242 ecr 2422032229], length 41
+     0.0.0.0.57704 > 10.42.3.34.22: Flags [P.], cksum 0x017f (correct), seq 3991821788:3991821829, ack 2903091536, win 502, options [nop,nop,TS val 943370242 ecr 2422032229], length 41
  
+ Enabling distributed floating IP (aka DVR) the problem goes away.
  
  When disabling hardware offload completely the problem goes away.
  
  Running ovn-northd without [0] also makes the problem go away.
  
  0: https://github.com/ovn-
  org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a

** Description changed:

  When hardware offload is enabled, CMS configures gateway routers and
  centralized NAT, the source address of packets flowing through the
  gateway are mangled after the connection preamble, not allowing the
- connection to form:
+ connection to form.
  
+ Example outside-in TCP connection attempt:
  tcpdump: listening on tape5c1862d-b4, link-type EN10MB (Ethernet), capture size 262144 bytes
  05:52:04.286472 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 62, id 8865, offset 0, flags [DF], proto TCP (6), length 60)
      10.11.2.11.57704 > 10.42.3.34.22: Flags [S], cksum 0x1e2a (correct), seq 3991821787, win 64240, options [mss 1460,sackOK,TS val 943370240 ecr 0,nop,wscale 7], length 0
  05:52:04.286661 fa:16:3e:fc:82:be > fa:16:3e:c8:19:af, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
      10.42.3.34.22 > 10.11.2.11.57704: Flags [S.], cksum 0x1990 (incorrect -> 0xdac4), seq 2903091535, ack 3991821788, win 62230, options [mss 8902,sackOK,TS val 2422032229 ecr 943370240,nop,wscale 7], length 0
  05:52:04.287314 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 66: (tos 0x0, ttl 62, id 8866, offset 0, flags [DF], proto TCP (6), length 52)
      10.11.2.11.57704 > 10.42.3.34.22: Flags [.], cksum 0x17c3 (correct), seq 1, ack 1, win 502, options [nop,nop,TS val 943370241 ecr 2422032229], length 0
  05:52:04.287589 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 107: (tos 0x0, ttl 62, id 8867, offset 0, flags [DF], proto TCP (6), length 93)
      0.0.0.0.57704 > 10.42.3.34.22: Flags [P.], cksum 0x017f (correct), seq 3991821788:3991821829, ack 2903091536, win 502, options [nop,nop,TS val 943370242 ecr 2422032229], length 41
  
+ Return traffic on connection-less UDP streams are also affected:
+ 05:51:56.441418 fa:16:3e:fc:82:be > fa:16:3e:c8:19:af, ethertype IPv4 (0x0800), length 104: (tos 0x0, ttl 64, id 56915, offset 0, flags [DF], proto UDP (17), length 90)
+     10.42.3.34.57502 > 194.169.254.1.53: [bad udp cksum 0xce4e -> 0x88d1!] 38125+ AAAA? fnordahl-bastion.openstack.partnercloud1.lan. (62)
+ 05:51:56.443387 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 179: (tos 0x0, ttl 63, id 64535, offset 0, flags [DF], proto UDP (17), length 165)
+     0.0.0.0.53 > 10.42.3.34.57502: [udp sum ok] 38125 NXDomain q: AAAA? fnordahl-bastion.openstack.partnercloud1.lan. 0/1/0 ns: . SOA a.root-servers.net. nstld.verisign-grs.com. 2022033100 1800 900 604800 86400 (137)
+ 
+ 
  Enabling distributed floating IP (aka DVR) the problem goes away.
  
  When disabling hardware offload completely the problem goes away.
  
  Running ovn-northd without [0] also makes the problem go away.
  
  0: https://github.com/ovn-
  org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a

** Description changed:

  When hardware offload is enabled, CMS configures gateway routers and
  centralized NAT, the source address of packets flowing through the
  gateway are mangled after the connection preamble, not allowing the
- connection to form.
+ connection to form. Return traffic on connections established from the
+ instance are also affected, even for UDP (DNS).
  
  Example outside-in TCP connection attempt:
  tcpdump: listening on tape5c1862d-b4, link-type EN10MB (Ethernet), capture size 262144 bytes
  05:52:04.286472 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 62, id 8865, offset 0, flags [DF], proto TCP (6), length 60)
      10.11.2.11.57704 > 10.42.3.34.22: Flags [S], cksum 0x1e2a (correct), seq 3991821787, win 64240, options [mss 1460,sackOK,TS val 943370240 ecr 0,nop,wscale 7], length 0
  05:52:04.286661 fa:16:3e:fc:82:be > fa:16:3e:c8:19:af, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
      10.42.3.34.22 > 10.11.2.11.57704: Flags [S.], cksum 0x1990 (incorrect -> 0xdac4), seq 2903091535, ack 3991821788, win 62230, options [mss 8902,sackOK,TS val 2422032229 ecr 943370240,nop,wscale 7], length 0
  05:52:04.287314 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 66: (tos 0x0, ttl 62, id 8866, offset 0, flags [DF], proto TCP (6), length 52)
      10.11.2.11.57704 > 10.42.3.34.22: Flags [.], cksum 0x17c3 (correct), seq 1, ack 1, win 502, options [nop,nop,TS val 943370241 ecr 2422032229], length 0
  05:52:04.287589 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 107: (tos 0x0, ttl 62, id 8867, offset 0, flags [DF], proto TCP (6), length 93)
      0.0.0.0.57704 > 10.42.3.34.22: Flags [P.], cksum 0x017f (correct), seq 3991821788:3991821829, ack 2903091536, win 502, options [nop,nop,TS val 943370242 ecr 2422032229], length 41
  
  Return traffic on connection-less UDP streams are also affected:
  05:51:56.441418 fa:16:3e:fc:82:be > fa:16:3e:c8:19:af, ethertype IPv4 (0x0800), length 104: (tos 0x0, ttl 64, id 56915, offset 0, flags [DF], proto UDP (17), length 90)
-     10.42.3.34.57502 > 194.169.254.1.53: [bad udp cksum 0xce4e -> 0x88d1!] 38125+ AAAA? fnordahl-bastion.openstack.partnercloud1.lan. (62)
+     10.42.3.34.57502 > 194.169.254.1.53: [bad udp cksum 0xce4e -> 0x88d1!] 38125+ AAAA? fnordahl-bastion.openstack.partnercloud1.lan. (62)
  05:51:56.443387 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 179: (tos 0x0, ttl 63, id 64535, offset 0, flags [DF], proto UDP (17), length 165)
-     0.0.0.0.53 > 10.42.3.34.57502: [udp sum ok] 38125 NXDomain q: AAAA? fnordahl-bastion.openstack.partnercloud1.lan. 0/1/0 ns: . SOA a.root-servers.net. nstld.verisign-grs.com. 2022033100 1800 900 604800 86400 (137)
- 
+     0.0.0.0.53 > 10.42.3.34.57502: [udp sum ok] 38125 NXDomain q: AAAA? fnordahl-bastion.openstack.partnercloud1.lan. 0/1/0 ns: . SOA a.root-servers.net. nstld.verisign-grs.com. 2022033100 1800 900 604800 86400 (137)
  
  Enabling distributed floating IP (aka DVR) the problem goes away.
  
  When disabling hardware offload completely the problem goes away.
  
  Running ovn-northd without [0] also makes the problem go away.
  
  0: https://github.com/ovn-
  org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to ovn in Ubuntu.
https://bugs.launchpad.net/bugs/1966981

Title:
  [hwol] centralized gateway NAT mangles source address

Status in ovn package in Ubuntu:
  Triaged

Bug description:
  When hardware offload is enabled and the CMS configures gateway routers
  and centralized NAT, the source address of packets flowing through the
  gateway is mangled (rewritten to 0.0.0.0) after the connection preamble,
  so the connection cannot form. Return traffic on connections established
  from the instance is also affected, even for UDP (DNS).

  Example outside-in TCP connection attempt:
  tcpdump: listening on tape5c1862d-b4, link-type EN10MB (Ethernet), capture size 262144 bytes
  05:52:04.286472 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 62, id 8865, offset 0, flags [DF], proto TCP (6), length 60)
      10.11.2.11.57704 > 10.42.3.34.22: Flags [S], cksum 0x1e2a (correct), seq 3991821787, win 64240, options [mss 1460,sackOK,TS val 943370240 ecr 0,nop,wscale 7], length 0
  05:52:04.286661 fa:16:3e:fc:82:be > fa:16:3e:c8:19:af, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
      10.42.3.34.22 > 10.11.2.11.57704: Flags [S.], cksum 0x1990 (incorrect -> 0xdac4), seq 2903091535, ack 3991821788, win 62230, options [mss 8902,sackOK,TS val 2422032229 ecr 943370240,nop,wscale 7], length 0
  05:52:04.287314 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 66: (tos 0x0, ttl 62, id 8866, offset 0, flags [DF], proto TCP (6), length 52)
      10.11.2.11.57704 > 10.42.3.34.22: Flags [.], cksum 0x17c3 (correct), seq 1, ack 1, win 502, options [nop,nop,TS val 943370241 ecr 2422032229], length 0
  05:52:04.287589 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 107: (tos 0x0, ttl 62, id 8867, offset 0, flags [DF], proto TCP (6), length 93)
      0.0.0.0.57704 > 10.42.3.34.22: Flags [P.], cksum 0x017f (correct), seq 3991821788:3991821829, ack 2903091536, win 502, options [nop,nop,TS val 943370242 ecr 2422032229], length 41
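
The fourth packet can be tied to the same TCP connection by its sequence numbers, even though its source address reads 0.0.0.0. Below is a minimal Python sketch of that arithmetic, using only values copied from the capture above. The function name is illustrative and not part of any tool.

```python
# Values copied from the tcpdump capture in this bug report.
SYN_SEQ = 3991821787            # seq of the client SYN
SYNACK_SEQ = 2903091535         # seq of the server SYN-ACK
MANGLED_SEQ_START = 3991821788  # seq range start of the packet sourced from 0.0.0.0
MANGLED_SEQ_END = 3991821829    # seq range end of that packet
MANGLED_ACK = 2903091536        # ack of that packet
MANGLED_LEN = 41                # payload length reported by tcpdump


def continues_handshake(syn_seq, synack_seq, seq_start, seq_end, ack, length):
    """Return True if a data segment is the first payload after the handshake.

    The first data byte must follow the SYN (syn_seq + 1), the ack must
    follow the SYN-ACK (synack_seq + 1), and the seq range must span
    exactly the reported payload length.
    """
    return (
        seq_start == syn_seq + 1
        and ack == synack_seq + 1
        and seq_end - seq_start == length
    )


if __name__ == "__main__":
    print(
        continues_handshake(
            SYN_SEQ,
            SYNACK_SEQ,
            MANGLED_SEQ_START,
            MANGLED_SEQ_END,
            MANGLED_ACK,
            MANGLED_LEN,
        )
    )
```

The check passes for the captured values, which is consistent with the packet being the client's first data segment with only its source address altered. tcpdump likely prints absolute sequence numbers for it because the changed source address makes it look like a separate flow.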

  Return traffic on connection-less UDP streams is also affected:
  05:51:56.441418 fa:16:3e:fc:82:be > fa:16:3e:c8:19:af, ethertype IPv4 (0x0800), length 104: (tos 0x0, ttl 64, id 56915, offset 0, flags [DF], proto UDP (17), length 90)
      10.42.3.34.57502 > 194.169.254.1.53: [bad udp cksum 0xce4e -> 0x88d1!] 38125+ AAAA? fnordahl-bastion.openstack.partnercloud1.lan. (62)
  05:51:56.443387 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 179: (tos 0x0, ttl 63, id 64535, offset 0, flags [DF], proto UDP (17), length 165)
      0.0.0.0.53 > 10.42.3.34.57502: [udp sum ok] 38125 NXDomain q: AAAA? fnordahl-bastion.openstack.partnercloud1.lan. 0/1/0 ns: . SOA a.root-servers.net. nstld.verisign-grs.com. 2022033100 1800 900 604800 86400 (137)
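
To spot affected packets in a longer capture, the address lines can be filtered for a 0.0.0.0 source. The following is a rough Python sketch that assumes the two-line verbose tcpdump text layout shown above. The regular expression and function name are illustrative, and the sample lines are abbreviated copies of lines from this report.

```python
import re

# Matches the indented address line of the verbose tcpdump output above,
# e.g. "    0.0.0.0.57704 > 10.42.3.34.22: Flags [P.], ..."
ADDR_LINE = re.compile(
    r"^\s*(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+):"
)


def find_mangled_sources(lines):
    """Return (src_port, dst, dst_port) for packets whose source is 0.0.0.0."""
    hits = []
    for line in lines:
        match = ADDR_LINE.match(line)
        if match and match.group("src") == "0.0.0.0":
            hits.append(
                (int(match.group("sport")), match.group("dst"), int(match.group("dport")))
            )
    return hits


# Abbreviated lines from the captures in this report.
SAMPLE = [
    "05:52:04.286472 fa:16:3e:c8:19:af > fa:16:3e:fc:82:be, ethertype IPv4 (0x0800), length 74:",
    "    10.11.2.11.57704 > 10.42.3.34.22: Flags [S], cksum 0x1e2a (correct)",
    "    10.42.3.34.22 > 10.11.2.11.57704: Flags [S.], cksum 0x1990 (incorrect -> 0xdac4)",
    "    0.0.0.0.57704 > 10.42.3.34.22: Flags [P.], cksum 0x017f (correct)",
    "    10.42.3.34.57502 > 194.169.254.1.53: [bad udp cksum 0xce4e -> 0x88d1!] 38125+ AAAA?",
    "    0.0.0.0.53 > 10.42.3.34.57502: [udp sum ok] 38125 NXDomain",
]

if __name__ == "__main__":
    for sport, dst, dport in find_mangled_sources(SAMPLE):
        print(f"source 0.0.0.0:{sport} -> {dst}:{dport}")
```

On the sample lines this reports the TCP data segment to port 22 and the DNS reply to port 57502, the two packets whose source address was mangled.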

  When distributed floating IP (aka DVR) is enabled, the problem goes away.

  When hardware offload is disabled completely, the problem goes away.

  Running ovn-northd without commit [0] also makes the problem go away.

  Versions:
  OVS 2.17.0
  OVN 22.03.0
  Kernel 5.15.0-23-generic

  Hardware: ConnectX-6 Dx MCX623106AN-CDAT
  Firmware: 22.32.2004
  Driver: OFED 5.5-1.0.3.2

  0: https://github.com/ovn-org/ovn/commit/4deac4509abbedd6ffaecf27eed01ddefccea40a

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1966981/+subscriptions



