[Bug 1927868] Re: vRouter not working after update to 16.3.1

Edward Hope-Morley 1927868 at bugs.launchpad.net
Fri Jun 25 11:48:43 UTC 2021


I have just re-tested all of this as follows:

 * deployed OpenStack Train (on Bionic i.e. 2:15.3.3-0ubuntu1~cloud0) with 3 gateway nodes
 * created one HA router, one vm with one fip
 * can ping fip and confirm single active router
 * upgraded neutron-server (api) to 16.3.0-0ubuntu3~cloud0 (ussuri), stopped server, neutron-db-manage upgrade head, start server
 * ping still works
 * upgraded all compute hosts to 16.3.0-0ubuntu3~cloud0, observed vrrp failover and short interruption
 * ping still works
 * upgraded one compute to 2:16.3.2-0ubuntu3~cloud0
 * ping still works
 * upgraded neutron-server (api) to 2:16.3.2-0ubuntu3~cloud0, stopped server, neutron-db-manage upgrade head (observed no migrations), start server
 * ping still works
 * upgraded remaining compute to 2:16.3.2-0ubuntu3~cloud0
 * ping still works
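
The neutron-server part of the steps above (stop, migrate, start) can be
sketched roughly as below. This is only an illustration, not the tooling
I used: the systemd unit name "neutron-server" and the helper
run_migration() are assumptions, and the package upgrade itself is
assumed to have already been done. Only "neutron-db-manage upgrade head"
comes from the steps above.

```python
import subprocess

# Assumption: neutron-server runs as the systemd unit "neutron-server".
# The migration command is the one used in the steps above.
MIGRATION_STEPS = [
    ["systemctl", "stop", "neutron-server"],
    ["neutron-db-manage", "upgrade", "head"],
    ["systemctl", "start", "neutron-server"],
]


def run_migration(dry_run=True):
    """Run the steps in order and return them as strings.

    With dry_run=True nothing is executed. Otherwise check=True makes
    subprocess raise on the first failing command, so the server is not
    restarted after a failed migration.
    """
    executed = []
    for cmd in MIGRATION_STEPS:
        if not dry_run:
            subprocess.run(cmd, check=True)
        executed.append(" ".join(cmd))
    return executed


if __name__ == "__main__":
    for line in run_migration(dry_run=True):
        print(line)
```

The point of the ordering is simply that the server is stopped before
the schema is migrated and only started again afterwards.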

I noticed that after upgrading to 2:16.3.2-0ubuntu3~cloud0 my interfaces
went from:

root@juju-f0dfb3-lp1927868-6:~# ip netns exec qrouter-8b5e4130-6688-45c5-bc8e-ee3781d8719c ip a s; pgrep -alf keepalived| grep -v state
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ha-bd1bd9ab-f8@if11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether fa:16:3e:6a:ae:8c brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 169.254.195.91/18 brd 169.254.255.255 scope global ha-bd1bd9ab-f8
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe6a:ae8c/64 scope link
       valid_lft forever preferred_lft forever
3: qg-9e134c20-1f@if13: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether fa:16:3e:c4:cc:84 brd ff:ff:ff:ff:ff:ff link-netnsid 0
4: qr-a125b622-2d@if14: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether fa:16:3e:0b:d3:74 brd ff:ff:ff:ff:ff:ff link-netnsid 0

to:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ha-bd1bd9ab-f8@if11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether fa:16:3e:6a:ae:8c brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 169.254.195.91/18 brd 169.254.255.255 scope global ha-bd1bd9ab-f8
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe6a:ae8c/64 scope link
       valid_lft forever preferred_lft forever
3: qg-9e134c20-1f@if13: <BROADCAST,MULTICAST> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether fa:16:3e:c4:cc:84 brd ff:ff:ff:ff:ff:ff link-netnsid 0
4: qr-a125b622-2d@if14: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether fa:16:3e:0b:d3:74 brd ff:ff:ff:ff:ff:ff link-netnsid 0

And it remained like that until the router became VRRP master:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ha-bd1bd9ab-f8@if11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether fa:16:3e:6a:ae:8c brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 169.254.195.91/18 brd 169.254.255.255 scope global ha-bd1bd9ab-f8
       valid_lft forever preferred_lft forever
    inet 169.254.0.12/24 scope global ha-bd1bd9ab-f8
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe6a:ae8c/64 scope link
       valid_lft forever preferred_lft forever
3: qg-9e134c20-1f@if13: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether fa:16:3e:c4:cc:84 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.5.152.110/32 scope global qg-9e134c20-1f
       valid_lft forever preferred_lft forever
    inet 10.5.153.178/16 scope global qg-9e134c20-1f
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fec4:cc84/64 scope link nodad
       valid_lft forever preferred_lft forever
4: qr-a125b622-2d@if14: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether fa:16:3e:0b:d3:74 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.21.1/24 scope global qr-a125b622-2d
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe0b:d374/64 scope link nodad
       valid_lft forever preferred_lft forever
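
For anyone who wants to compare listings like the ones above without
reading them by eye, here is a rough Python sketch that reduces "ip a s"
output to the link state and IPv4 addresses of each interface. The
function name summarize() and the regexes are my own illustration, not
anything neutron ships, and they assume the usual iproute2 text layout
shown above.

```python
import re

# Interface header, e.g.
# "3: qg-9e134c20-1f@if13: <BROADCAST,MULTICAST> mtu 1500 ... state DOWN ..."
HEADER = re.compile(
    r"^\d+:\s+(?P<name>[^:@\s]+)[^<]*<(?P<flags>[^>]*)>.*\bstate\s+(?P<state>\S+)"
)
# IPv4 address line; "inet6" does not match because of the \s+ after "inet".
ADDR = re.compile(r"^\s+inet\s+(?P<addr>\S+)")


def summarize(ip_output):
    """Map interface name -> {"state": ..., "ipv4": [...]}."""
    result = {}
    current = None
    for line in ip_output.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group("name")
            result[current] = {"state": header.group("state"), "ipv4": []}
            continue
        addr = ADDR.match(line)
        if addr and current is not None:
            result[current]["ipv4"].append(addr.group("addr"))
    return result
```

Fed the second listing, this would report qg-9e134c20-1f as DOWN with no
IPv4 addresses, and fed the third it would report it as UP with the two
addresses shown.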

So based on this test I do not see any issues with upgrading to
2:16.3.2-0ubuntu3~cloud0, and it is clear that what causes issues when
coming from Train is failing to upgrade in the correct order and to run
the db migrations with neutron-server stopped. If anyone has any more
information that can help prove the contrary please share.

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to neutron in Ubuntu.
https://bugs.launchpad.net/bugs/1927868

Title:
  vRouter not working after update to 16.3.1

Status in neutron:
  New
Status in neutron package in Ubuntu:
  New

Bug description:
  We run a juju managed OpenStack Ussuri on Bionic. After updating
  neutron packages from 16.3.0 to 16.3.1 all virtual routers stopped
  working. It seems that most (not all) namespaces are created but have
  only the lo interface and sometimes the ha-XYZ interface in DOWN state.
  The underlying tap interfaces are also down.

  neutron-l3-agent has many logs similar to the following:
  2021-05-08 15:01:45.286 39411 ERROR neutron.agent.l3.ha_router [-] Gateway interface for router 02945b59-639b-41be-8237-3b7933b4e32d was not set up; router will not work properly

  and journal logs report at around the same time
  May 08 15:01:40 lar1615.srv-louros.grnet.gr neutron-keepalived-state-change[18596]: 2021-05-08 15:01:40.765 18596 INFO neutron.agent.linux.ip_lib [-] Failed sending gratuitous ARP to 62.62.62.62 on qg-5a6efe8c-6b in namespace qrouter-02945b59-639b-41be-8237-3b7933b4e32d: Exit code: 2; Stdin: ; Stdout: Interface "qg-5a6efe8c-6b" is down
  May 08 15:01:40 lar1615.srv-louros.grnet.gr neutron-keepalived-state-change[18596]: 2021-05-08 15:01:40.767 18596 INFO neutron.agent.linux.ip_lib [-] Interface qg-5a6efe8c-6b or address 62.62.62.62 in namespace qrouter-02945b59-639b-41be-8237-3b7933b4e32d was deleted concurrently

  
  The neutron packages installed are:

  ii  neutron-common                         2:16.3.1-0ubuntu1~cloud0                                    all          Neutron is a virtual network service for Openstack - common
  ii  neutron-dhcp-agent                     2:16.3.1-0ubuntu1~cloud0                                    all          Neutron is a virtual network service for Openstack - DHCP agent
  ii  neutron-l3-agent                       2:16.3.1-0ubuntu1~cloud0                                    all          Neutron is a virtual network service for Openstack - l3 agent
  ii  neutron-metadata-agent                 2:16.3.1-0ubuntu1~cloud0                                    all          Neutron is a virtual network service for Openstack - metadata agent
  ii  neutron-metering-agent                 2:16.3.1-0ubuntu1~cloud0                                    all          Neutron is a virtual network service for Openstack - metering agent
  ii  neutron-openvswitch-agent              2:16.3.1-0ubuntu1~cloud0                                    all          Neutron is a virtual network service for Openstack - Open vSwitch plugin agent
  ii  python3-neutron                        2:16.3.1-0ubuntu1~cloud0                                    all          Neutron is a virtual network service for Openstack - Python library
  ii  python3-neutron-lib                    2.3.0-0ubuntu1~cloud0                                       all          Neutron shared routines and utilities - Python 3.x
  ii  python3-neutronclient                  1:7.1.1-0ubuntu1~cloud0                                     all          client API library for Neutron - Python 3.x


  Downgrading to 16.3.0 resolves the issues.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1927868/+subscriptions


