[Bug 1701023] Re: (on trusty) version 1.9-3ubuntu10.4 regression blocking boot completion
Tom Verdaat
tom at verdaat.org
Thu Apr 26 14:35:06 UTC 2018
We've been doing a lot more testing and debugging and I'd like to share
our findings:
1) Unfortunately it turns out this change does not fix the issue of interfaces not coming up correctly for a bond with a (static) network configuration. The race condition seems to be gone, so at least there are no more hangs between bonds and their vlan children. All the interfaces also report that they are UP, both when running ifup and after a reboot. However:
- Running "ifup <slavename>" does bring up the bond (and its vlans) in a working state.
- Running "ifup -a" or rebooting does not actually work, causing "network not available" errors and "Destination Host Unreachable" when pinging other machines. Executing "ifdown -a; ifup -a" shows that ifupdown tries to bring up the bond BEFORE the slaves instead of the other way around. Even though the bond and its slaves say they are UP after the 60s timeout, they don't actually function.
- We're not seeing any issues with bonds that do not have a network configuration of their own
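For anyone trying to reproduce the "UP but not functional" state above, a check along these lines may help to see what the kernel thinks is enslaved. This is only a sketch: the function name bond_report and its optional second argument are made up for illustration and are not part of ifupdown or ifenslave. It only reads the bonding/slaves and operstate files under /sys/class/net.

```shell
#!/bin/sh
# Hypothetical helper (not part of ifupdown): print each slave the kernel
# reports for a bond, followed by that slave's operstate.
# $1 = bond name
# $2 = sysfs net root (defaults to /sys/class/net; the override is an
#      assumption added so the function can be tried against a fake tree)
bond_report() {
    bond=$1
    root=${2:-/sys/class/net}
    slaves_file="$root/$bond/bonding/slaves"
    if [ ! -r "$slaves_file" ]; then
        echo "$bond: not a bond, or it does not exist" >&2
        return 1
    fi
    slaves=$(cat "$slaves_file")
    if [ -z "$slaves" ]; then
        echo "$bond: no slaves enslaved"
        return 2
    fi
    # One line per slave: "<name> <operstate>"
    for s in $slaves; do
        state=$(cat "$root/$s/operstate" 2>/dev/null || echo unknown)
        echo "$s $state"
    done
}
```

Running "bond_report bo-adm" after "ifup -a" and again after "ifup <slavename>" should make it visible whether the slaves were really attached in both cases.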
2) The networking script stack / concept seems fundamentally flawed in
three areas:
2.A) bonds rely on the slaves having "bond-master" and are started by
bringing up the slaves, but there is no support for the master having
"bond-slaves" so that a bond can be started by just bringing up the
bond directly.
2.B) bringing a specific interface up automatically brings up its child
vlans. This does not make a lot of sense. The other way around does -
e.g. in order to bring up a vlan we need to bring up its raw device -
but why would the ifupdown scripts assume that I want to bring up all of
its vlans when I bring up an interface that (also) serves as a raw
device? In that case I would probably run "ifup -a"!
2.C) a vlan running on top of a bond cannot be brought up directly due to /sys/class/net/<bondname>/ not existing. This results in the following:
> # ifup bo-adm.2
> Set name-type for VLAN subsystem. Should be visible in /proc/net/vlan/config
> cat: /sys/class/net/bo-adm/mtu: No such file or directory
> Device "bo-adm" does not exist.
> bo-adm does not exist, unable to create bo-adm.2
> run-parts: /etc/network/if-pre-up.d/vlan exited with return code 1
> Failed to bring up bo-adm.2.
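The failure above comes down to the raw device not existing yet when the vlan is brought up. A guard of roughly this shape could be used by a wrapper script to detect that case first. It is a sketch under assumptions: both function names are hypothetical, the vlan is assumed to be named "<raw>.<id>" as in the output above, and nothing here is taken from the actual if-pre-up.d/vlan script.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the vlan or ifupdown packages.

# Derive the raw device from a "<raw>.<id>" style vlan name,
# e.g. bo-adm.2 -> bo-adm. Fails if the name contains no dot.
vlan_raw_device() {
    case $1 in
        *.*) echo "${1%.*}" ;;
        *)   return 1 ;;
    esac
}

# Succeed only if the raw device of the given vlan already exists.
# $2 = sysfs net root, defaulting to /sys/class/net (the override is an
#      assumption added so the check can be tried against a fake tree).
vlan_raw_exists() {
    raw=$(vlan_raw_device "$1") || return 1
    [ -e "${2:-/sys/class/net}/$raw" ]
}
```

With this, "vlan_raw_exists bo-adm.2" returns non-zero in the situation shown above, so a wrapper could bring up the slaves first and only then run "ifup bo-adm.2".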
3) Our new workaround for boot has become this very intrusive systemd service:
> [Unit]
> Wants=network-online.target
> After=network-online.target
>
> [Install]
> WantedBy=multi-user.target
>
> [Service]
> Type=oneshot
> ExecStartPre=/sbin/ifdown bo-adm
> ExecStart=/sbin/ifup enp0s3
> ExecStart=/sbin/ifup enp0s10
> ExecStop=/sbin/ifdown bo-adm
> RemainAfterExit=yes
> TimeoutStartSec=5min
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to ifupdown in Ubuntu.
https://bugs.launchpad.net/bugs/1701023
Title:
(on trusty) version 1.9-3ubuntu10.4 regression blocking boot
completion
Status in ifupdown package in Ubuntu:
In Progress
Status in vlan package in Ubuntu:
In Progress
Status in ifupdown source package in Trusty:
In Progress
Status in vlan source package in Trusty:
In Progress
Status in ifupdown source package in Xenial:
In Progress
Status in vlan source package in Xenial:
In Progress
Status in ifupdown source package in Artful:
In Progress
Status in vlan source package in Artful:
In Progress
Status in ifupdown source package in Bionic:
In Progress
Status in vlan source package in Bionic:
In Progress
Status in ifupdown package in Debian:
Fix Released
Status in vlan package in Debian:
New
Bug description:
When upgrading from version 1.9-3ubuntu10.1, a previously working
machine can't successfully reboot completely.
ifup is hanging indefinitely, with this process structure (from
"pstree -a 1299"):
ifup,1299 -a
  └─run-parts,1501 /etc/network/if-pre-up.d
      └─bridge,1502 /etc/network/if-pre-up.d/bridge
          └─bridge,1508 /etc/network/if-pre-up.d/bridge
              └─vlan,1511 /etc/network/if-pre-up.d/vlan
                  └─ifup,1532 eth0
<begin content of /etc/network/interfaces>
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet static
address 192.168.10.65
netmask 255.255.255.192
gateway 192.168.10.66
auto eth0.11
address 192.168.11.1
netmask 255.255.255.0
auto br1134
iface br1134 inet manual
bridge_ports eth0.1134
bridge_stp off
bridge_fd 0
<end content of /etc/network/interfaces>
The underlying interface eth0.1134 is not explicitly defined, but was
previously auto-created during "ifup -a" execution. This apparently
fails now.
Reverting back to the 10.1 version re-establishes old behavior.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ifupdown/+bug/1701023/+subscriptions