[Bug 1759956] Re: [dvr][fast-exit] incorrect policy rules get deleted when a distributed router has ports on multiple tenant networks

Corey Bryant corey.bryant at canonical.com
Mon Apr 16 18:38:04 UTC 2018


** Description changed:

- TL;DR: ip -4 rule del priority <priority> table <table-id> type unicast
- will delete the first matching rule it encounters: if there are two
- rules with the same priority it will just kill the first one it finds.
+ Ubuntu SRU details
+ ------------------
+ [Impact]
+ See Original Description below.
+ 
+ [Test Case]
+ See Original Description below.
+ 
+ [Regression Potential]
+ Low. All patches have landed upstream in corresponding stable branches. 
+ 
+ Original Description
+ --------------------
+ TL;DR: ip -4 rule del priority <priority> table <table-id> type unicast will delete the first matching rule it encounters: if there are two rules with the same priority it will just kill the first one it finds.
  
  The original setup is described here:
  https://bugs.launchpad.net/ubuntu/+source/neutron/+bug/1759918
  
  OpenStack Queens from UCA (xenial, GA kernel, deployed via OpenStack
  charms), 2 external subnets (one routed provider network), 2 tenant
  subnets all in the same address scope to trigger "fast exit".
  
  2 tenant networks attached (subnets 192.168.100.0/24 and
  192.168.200.0/24) to a DVR:
  
  # 2 rules as expected
  ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip rule
- 0:      from all lookup local 
- 32766:  from all lookup main 
- 32767:  from all lookup default 
- 80000:  from 192.168.100.0/24 lookup 16 
- 80000:  from 192.168.200.0/24 lookup 16 
+ 0:      from all lookup local
+ 32766:  from all lookup main
+ 32767:  from all lookup default
+ 80000:  from 192.168.100.0/24 lookup 16
+ 80000:  from 192.168.200.0/24 lookup 16
  
  # removing 192.168.200.0/24 sometimes deletes the wrong policy rule
  openstack router remove subnet pubrouter othertenantsubnet
  
  # ip route del contains the cidr
  2018-03-29 20:09:52.946 2083594 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'fip-d0f008fc-dc45-4237-9ce0-a9e1977735eb', 'ip', '-4', 'route', 'del', '192.168.200.0/24', 'via', '169.254.93.94', 'dev', 'fpr-4f9ca9ef-3'] create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:92
  
  # ip rule delete is not that specific
  2018-03-29 20:09:53.195 2083594 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800', 'ip', '-4', 'rule', 'del', 'priority', '80000', 'table', '16', 'type', 'unicast'] create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:92
  
- 
  2018-03-29 20:15:59.210 2083594 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800', 'ip', '-4', 'rule', 'show'] create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:92
  2018-03-29 20:15:59.455 2083594 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800', 'ip', '-4', 'rule', 'add', 'from', '192.168.100.0/24', 'priority', '80000', 'table', '16', 'type', 'unicast'] create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:92
  
  ~~~~
  
  ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip rule
- 0:      from all lookup local 
- 32766:  from all lookup main 
- 32767:  from all lookup default 
- 80000:  from 192.168.100.0/24 lookup 16 
- 80000:  from 192.168.200.0/24 lookup 16 
+ 0:      from all lookup local
+ 32766:  from all lookup main
+ 32767:  from all lookup default
+ 80000:  from 192.168.100.0/24 lookup 16
+ 80000:  from 192.168.200.0/24 lookup 16
  
  # try to delete a rule manually to see what is going on
  
  ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip rule ; ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip -4 rule del priority 80000 table 16 type unicast ; ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip rule
- 0:      from all lookup local 
- 32766:  from all lookup main 
- 32767:  from all lookup default 
- 80000:  from 192.168.100.0/24 lookup 16 
- 80000:  from 192.168.200.0/24 lookup 16 
+ 0:      from all lookup local
+ 32766:  from all lookup main
+ 32767:  from all lookup default
+ 80000:  from 192.168.100.0/24 lookup 16
+ 80000:  from 192.168.200.0/24 lookup 16
  
- 0:      from all lookup local 
- 32766:  from all lookup main 
- 32767:  from all lookup default 
- 80000:  from 192.168.200.0/24 lookup 16 
+ 0:      from all lookup local
+ 32766:  from all lookup main
+ 32767:  from all lookup default
+ 80000:  from 192.168.200.0/24 lookup 16
  
  # ^^ 192.168.100.0/24 rule got deleted instead of 192.168.200.0/24
  
  # add the rule back manually
  ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip rule add from 192.168.100.0/24 priority 80000 table 16 type unicast
  
  # different order now - 192.168.200.0/24 is first
  ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip rule
- 0:      from all lookup local 
- 32766:  from all lookup main 
- 32767:  from all lookup default 
- 80000:  from 192.168.200.0/24 lookup 16 
- 80000:  from 192.168.100.0/24 lookup 16 
+ 0:      from all lookup local
+ 32766:  from all lookup main
+ 32767:  from all lookup default
+ 80000:  from 192.168.200.0/24 lookup 16
+ 80000:  from 192.168.100.0/24 lookup 16
  
  # now 192.168.200.0/24 got deleted because it was first to match
  
  ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip rule ; ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip -4 rule del priority 80000 table 16 type unicast ; ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip rule
- 0:      from all lookup local 
- 32766:  from all lookup main 
- 32767:  from all lookup default 
- 80000:  from 192.168.200.0/24 lookup 16 
- 80000:  from 192.168.100.0/24 lookup 16 
+ 0:      from all lookup local
+ 32766:  from all lookup main
+ 32767:  from all lookup default
+ 80000:  from 192.168.200.0/24 lookup 16
+ 80000:  from 192.168.100.0/24 lookup 16
  
- 0:      from all lookup local 
- 32766:  from all lookup main 
- 32767:  from all lookup default 
- 80000:  from 192.168.100.0/24 lookup 16 
- 
+ 0:      from all lookup local
+ 32766:  from all lookup main
+ 32767:  from all lookup default
+ 80000:  from 192.168.100.0/24 lookup 16
  
  Code:
  
  _dvr_internal_network_removed
  https://github.com/openstack/neutron/blob/stable/queens/neutron/agent/l3/dvr_local_router.py#L431-L443
  
  _delete_interface_routing_rule_in_router_ns
  https://github.com/openstack/neutron/blob/stable/queens/neutron/agent/l3/dvr_local_router.py#L642-L648
-         ip_rule = ip_lib.IPRule(namespace=self.ns_name)
-         for subnet in router_port['subnets']:
-             rtr_port_cidr = subnet['cidr']
-             ip_rule.rule.delete(ip=rtr_port_cidr,
-                                 table=dvr_fip_ns.FIP_RT_TBL,
-                                 priority=dvr_fip_ns.FAST_PATH_EXIT_PR)
+         ip_rule = ip_lib.IPRule(namespace=self.ns_name)
+         for subnet in router_port['subnets']:
+             rtr_port_cidr = subnet['cidr']
+             ip_rule.rule.delete(ip=rtr_port_cidr,
+                                 table=dvr_fip_ns.FIP_RT_TBL,
+                                 priority=dvr_fip_ns.FAST_PATH_EXIT_PR)
  
  IpRuleCommand
  https://github.com/openstack/neutron/blob/master/neutron/agent/linux/ip_lib.py#L486-L494
  
-         # TODO(Carl) ip ignored in delete, okay in general?
+         # TODO(Carl) ip ignored in delete, okay in general?
  
  He-he, experience shows that definitely not.
  
  We need to use the most specific rule description to avoid ordering
  issues.
  
  ip -4 rule del from 192.168.200.0/24 priority 80000 table 16 type unicast
  
  With a fix it looks like this:
  
  2018-03-29 20:58:57.023 192084 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800', 'ip', '-4', 'rule', 'del', 'from', '192.168.200.0/24', 'priority', '80000', 'table', '16', 'type', 'unicast'] create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:92

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to neutron in Ubuntu.
https://bugs.launchpad.net/bugs/1759956

Title:
  [dvr][fast-exit] incorrect policy rules get deleted when a distributed
  router has ports on multiple tenant networks

Status in Ubuntu Cloud Archive:
  Triaged
Status in Ubuntu Cloud Archive pike series:
  Triaged
Status in Ubuntu Cloud Archive queens series:
  Triaged
Status in neutron:
  Fix Released
Status in neutron package in Ubuntu:
  Triaged
Status in neutron source package in Artful:
  Triaged
Status in neutron source package in Bionic:
  Triaged

Bug description:
  Ubuntu SRU details
  ------------------
  [Impact]
  See Original Description below.

  [Test Case]
  See Original Description below.

  [Regression Potential]
  Low. All patches have landed upstream in corresponding stable branches. 

  Original Description
  --------------------
  TL;DR: ip -4 rule del priority <priority> table <table-id> type unicast will delete the first matching rule it encounters: if there are two rules with the same priority it will just kill the first one it finds.

  The original setup is described here:
  https://bugs.launchpad.net/ubuntu/+source/neutron/+bug/1759918

  OpenStack Queens from UCA (xenial, GA kernel, deployed via OpenStack
  charms), 2 external subnets (one routed provider network), 2 tenant
  subnets all in the same address scope to trigger "fast exit".

  2 tenant networks attached (subnets 192.168.100.0/24 and
  192.168.200.0/24) to a DVR:

  # 2 rules as expected
  ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip rule
  0:      from all lookup local
  32766:  from all lookup main
  32767:  from all lookup default
  80000:  from 192.168.100.0/24 lookup 16
  80000:  from 192.168.200.0/24 lookup 16

  # removing 192.168.200.0/24 sometimes deletes the wrong policy rule
  openstack router remove subnet pubrouter othertenantsubnet

  # ip route del contains the cidr
  2018-03-29 20:09:52.946 2083594 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'fip-d0f008fc-dc45-4237-9ce0-a9e1977735eb', 'ip', '-4', 'route', 'del', '192.168.200.0/24', 'via', '169.254.93.94', 'dev', 'fpr-4f9ca9ef-3'] create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:92

  # ip rule delete is not that specific
  2018-03-29 20:09:53.195 2083594 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800', 'ip', '-4', 'rule', 'del', 'priority', '80000', 'table', '16', 'type', 'unicast'] create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:92

  2018-03-29 20:15:59.210 2083594 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800', 'ip', '-4', 'rule', 'show'] create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:92
  2018-03-29 20:15:59.455 2083594 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800', 'ip', '-4', 'rule', 'add', 'from', '192.168.100.0/24', 'priority', '80000', 'table', '16', 'type', 'unicast'] create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:92

  ~~~~

  ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip rule
  0:      from all lookup local
  32766:  from all lookup main
  32767:  from all lookup default
  80000:  from 192.168.100.0/24 lookup 16
  80000:  from 192.168.200.0/24 lookup 16

  # try to delete a rule manually to see what is going on

  ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip rule ; ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip -4 rule del priority 80000 table 16 type unicast ; ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip rule
  0:      from all lookup local
  32766:  from all lookup main
  32767:  from all lookup default
  80000:  from 192.168.100.0/24 lookup 16
  80000:  from 192.168.200.0/24 lookup 16

  0:      from all lookup local
  32766:  from all lookup main
  32767:  from all lookup default
  80000:  from 192.168.200.0/24 lookup 16

  # ^^ 192.168.100.0/24 rule got deleted instead of 192.168.200.0/24

  # add the rule back manually
  ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip rule add from 192.168.100.0/24 priority 80000 table 16 type unicast

  # different order now - 192.168.200.0/24 is first
  ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip rule
  0:      from all lookup local
  32766:  from all lookup main
  32767:  from all lookup default
  80000:  from 192.168.200.0/24 lookup 16
  80000:  from 192.168.100.0/24 lookup 16

  # now 192.168.200.0/24 got deleted because it was first to match

  ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip rule ; ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip -4 rule del priority 80000 table 16 type unicast ; ip netns exec qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800 ip rule
  0:      from all lookup local
  32766:  from all lookup main
  32767:  from all lookup default
  80000:  from 192.168.200.0/24 lookup 16
  80000:  from 192.168.100.0/24 lookup 16

  0:      from all lookup local
  32766:  from all lookup main
  32767:  from all lookup default
  80000:  from 192.168.100.0/24 lookup 16
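
  The behaviour seen in the two manual runs above can be reproduced with a
  toy model. This is only a sketch: the rule table is a plain Python list,
  and delete_first_match is a hypothetical helper, not kernel or iproute2
  code. It assumes deletion removes the first rule whose fields match
  everything given in the selector, which is what the output above suggests.

```python
from typing import Dict, List, Optional

Rule = Dict[str, str]


def delete_first_match(rules: List[Rule], selector: Rule) -> Optional[Rule]:
    """Remove and return the first rule whose fields all match selector.

    Toy model of the observed behaviour (assumption), not kernel code.
    """
    for index, rule in enumerate(rules):
        if all(rule.get(key) == value for key, value in selector.items()):
            return rules.pop(index)
    return None


def make_rules() -> List[Rule]:
    # Same two fast-exit rules as in the qrouter namespace above.
    return [
        {"priority": "80000", "from": "192.168.100.0/24", "table": "16"},
        {"priority": "80000", "from": "192.168.200.0/24", "table": "16"},
    ]


# Ambiguous selector: priority and table only, as in the logged command.
rules = make_rules()
removed_ambiguous = delete_first_match(
    rules, {"priority": "80000", "table": "16"}
)

# Specific selector: the source prefix pins down exactly one rule.
rules = make_rules()
removed_specific = delete_first_match(
    rules, {"priority": "80000", "table": "16", "from": "192.168.200.0/24"}
)

print(removed_ambiguous["from"])  # 192.168.100.0/24 (the wrong rule)
print(removed_specific["from"])   # 192.168.200.0/24 (the intended rule)
```

  With the ambiguous selector the result depends purely on list order,
  which matches the "different order now" experiment above.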

  Code:

  _dvr_internal_network_removed
  https://github.com/openstack/neutron/blob/stable/queens/neutron/agent/l3/dvr_local_router.py#L431-L443

  _delete_interface_routing_rule_in_router_ns
  https://github.com/openstack/neutron/blob/stable/queens/neutron/agent/l3/dvr_local_router.py#L642-L648
          ip_rule = ip_lib.IPRule(namespace=self.ns_name)
          for subnet in router_port['subnets']:
              rtr_port_cidr = subnet['cidr']
              ip_rule.rule.delete(ip=rtr_port_cidr,
                                  table=dvr_fip_ns.FIP_RT_TBL,
                                  priority=dvr_fip_ns.FAST_PATH_EXIT_PR)

  IpRuleCommand
  https://github.com/openstack/neutron/blob/master/neutron/agent/linux/ip_lib.py#L486-L494

          # TODO(Carl) ip ignored in delete, okay in general?

  He-he, experience shows that definitely not.

  We need to use the most specific rule description to avoid ordering
  issues.

  ip -4 rule del from 192.168.200.0/24 priority 80000 table 16 type unicast
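
  As a rough illustration of what "most specific" means for the argument
  list, here is a minimal sketch. build_rule_del_args is a hypothetical
  helper invented for this example; it is not the neutron ip_lib API and
  not the upstream patch. It only shows the source CIDR being passed
  through to the command instead of being dropped.

```python
from typing import List, Optional


def build_rule_del_args(priority: int, table: int,
                        cidr: Optional[str] = None,
                        ip_version: int = 4) -> List[str]:
    """Build an 'ip rule del' argument list (hypothetical helper).

    When cidr is given, a 'from <cidr>' selector is included so that only
    the rule for that subnet can match.
    """
    args = ["ip", f"-{ip_version}", "rule", "del"]
    if cidr is not None:
        args += ["from", cidr]
    args += ["priority", str(priority), "table", str(table),
             "type", "unicast"]
    return args


# Without the CIDR this is the ambiguous command from the logs.
ambiguous = build_rule_del_args(priority=80000, table=16)

# With the CIDR it matches the specific command shown above.
specific = build_rule_del_args(priority=80000, table=16,
                               cidr="192.168.200.0/24")

print(" ".join(ambiguous))
print(" ".join(specific))
```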

  With a fix it looks like this:

  2018-03-29 20:58:57.023 192084 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-4f9ca9ef-303b-4082-abbc-e50782d9b800', 'ip', '-4', 'rule', 'del', 'from', '192.168.200.0/24', 'priority', '80000', 'table', '16', 'type', 'unicast'] create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:92
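
  For checking agent logs for the ambiguous form, something along these
  lines might help. It is a sketch only: logged_argv and
  is_ambiguous_rule_delete are hypothetical helper names, the sample lines
  are shortened versions of the log entries above (no timestamp, no
  rootwrap prefix), and the regex assumes the "Running command: [...]"
  layout seen in this report.

```python
import ast
import re
from typing import List, Optional

# Assumes the argument list is printed as a Python list literal with no
# nested brackets, as in the DEBUG lines quoted in this report.
COMMAND_RE = re.compile(r"Running command: (\[.*?\])")


def logged_argv(line: str) -> Optional[List[str]]:
    """Extract the argument list from a 'Running command:' log line."""
    match = COMMAND_RE.search(line)
    if match is None:
        return None
    return ast.literal_eval(match.group(1))


def is_ambiguous_rule_delete(argv: List[str]) -> bool:
    """True if argv is an 'ip rule del' with no 'from' selector."""
    for i in range(len(argv) - 1):
        if argv[i] == "rule" and argv[i + 1] == "del":
            return "from" not in argv[i + 2:]
    return False


# Shortened sample lines modelled on the entries above.
before_fix = ("DEBUG neutron.agent.linux.utils [-] Running command: "
              "['ip', '-4', 'rule', 'del', 'priority', '80000', "
              "'table', '16', 'type', 'unicast'] create_process")
after_fix = ("DEBUG neutron.agent.linux.utils [-] Running command: "
             "['ip', '-4', 'rule', 'del', 'from', '192.168.200.0/24', "
             "'priority', '80000', 'table', '16', 'type', 'unicast'] "
             "create_process")

print(is_ambiguous_rule_delete(logged_argv(before_fix)))
print(is_ambiguous_rule_delete(logged_argv(after_fix)))
```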

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1759956/+subscriptions


