[Bug 1056792] Re: Bonded network device is not correctly detected during boot-up.
annunaki2k2
russell at knighton.me.uk
Fri Sep 28 10:29:58 UTC 2012
As requested, an attached tarball file of the upstart logs.
** Attachment added: "Tarball of upstart logs."
https://bugs.launchpad.net/ubuntu/+source/ifenslave-2.6/+bug/1056792/+attachment/3351331/+files/upstart_logs_pm1.tar
** Changed in: ifenslave-2.6 (Ubuntu)
Status: Incomplete => New
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to ifenslave-2.6 in Ubuntu.
https://bugs.launchpad.net/bugs/1056792
Title:
Bonded network device is not correctly detected during boot-up.
Status in “ifenslave-2.6” package in Ubuntu:
New
Bug description:
We have an x86_64 Intel server running 12.04.1, and it is connected
using two onboard 1G network interfaces in an LACP bond. The configuration works
fine, but for some very annoying reason, when the machine boots, the
start-up scripts hang for two minutes waiting for the connection to
come up - yet the connection is actually already up (and pingable
remotely).
Here is my interfaces configuration file:
russell@pm1 ~ $ cat /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
# The loopback network interface
auto lo
iface lo inet loopback
# Slave Definition for bond0
auto eth0
iface eth0 inet manual
    bond-master bond0
auto eth1
iface eth1 inet manual
    bond-master bond0
# The primary network interface
auto bond0
iface bond0 inet static
    address 10.0.1.151
    netmask 255.255.254.0
    broadcast 10.0.1.255
    network 10.0.0.0
    gateway 10.0.0.1
    dns-nameservers 10.0.0.120 10.0.1.120
    dns-search mps.lan wilts.mps.lan
    dns-domain mps.lan
    bond-mode 802.3ad
    bond-miimon 100
    bond-lacp_rate 1
    bond-slaves none
    # bond-use_carrier 1
    post-up /usr/local/sbin/check-bond.sh $IFACE
    pre-down /usr/local/sbin/check-bond.sh stop $IFACE
And (once the machine times out and continues its boot), here is the resultant configuration:
russell@pm1 ~ $ ifconfig
bond0 Link encap:Ethernet HWaddr 00:1e:67:44:58:88
inet addr:10.0.1.151 Bcast:10.0.1.255 Mask:255.255.254.0
inet6 addr: fe80::21e:67ff:fe44:5888/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:2644 errors:0 dropped:827 overruns:0 frame:0
TX packets:1575 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:282832 (282.8 KB) TX bytes:261199 (261.1 KB)
eth0 Link encap:Ethernet HWaddr 00:1e:67:44:58:88
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:803 errors:0 dropped:803 overruns:0 frame:0
TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:70241 (70.2 KB) TX bytes:992 (992.0 B)
Memory:d0b20000-d0b40000
eth1 Link encap:Ethernet HWaddr 00:1e:67:44:58:88
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:1841 errors:0 dropped:0 overruns:0 frame:0
TX packets:1567 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:212591 (212.5 KB) TX bytes:260207 (260.2 KB)
Memory:d0b00000-d0b20000
russell@pm1 ~ $ cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
802.3ad info
LACP rate: fast
Min links: 0
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 1
Actor Key: 17
Partner Key: 1
Partner Mac Address: 00:00:00:00:00:00
Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1e:67:44:58:88
Aggregator ID: 1
Slave queue ID: 0
Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1e:67:44:58:87
Aggregator ID: 2
Slave queue ID: 0
As you can see, it has actually booted with the correct configuration
- it just decided to waste two minutes because it failed to detect
correctly that the network is actually configured and ready.
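For anyone who wants to check the same bond status from a script rather than by eye, here is a minimal Python sketch that reads text in the format of /proc/net/bonding/bond0 and reports the active aggregator plus each slave's MII status and aggregator ID. The function name parse_bond_status and the trimmed SAMPLE text are illustrative assumptions, not part of ifenslave or the bonding driver; the sample lines are copied from the output above.

```python
# Sketch only: parse_bond_status and SAMPLE are hypothetical names for
# illustration. SAMPLE is a trimmed copy of the bond0 status shown above.
SAMPLE = """\
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 1
Slave Interface: eth1
MII Status: up
Permanent HW addr: 00:1e:67:44:58:88
Aggregator ID: 1
Slave Interface: eth0
MII Status: up
Permanent HW addr: 00:1e:67:44:58:87
Aggregator ID: 2
"""


def parse_bond_status(text):
    """Return (active_aggregator_id, {slave: {"mii": ..., "aggregator": ...}})."""
    active = None
    slaves = {}
    current = None  # slave section currently being read, if any
    for line in text.splitlines():
        # Split on the first colon only, so MAC addresses stay intact.
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
            slaves[current] = {"mii": None, "aggregator": None}
        elif key == "Aggregator ID":
            if current is None:
                # Appears before any slave section: the active aggregator.
                active = value
            else:
                slaves[current]["aggregator"] = value
        elif key == "MII Status" and current is not None:
            slaves[current]["mii"] = value
    return active, slaves


if __name__ == "__main__":
    # On a live system, read /proc/net/bonding/bond0 instead of SAMPLE.
    active, slaves = parse_bond_status(SAMPLE)
    print("active aggregator:", active)
    for name, info in slaves.items():
        print(name, info)
```

Run against the output pasted above, this would list both slaves with MII status up, with eth1 in aggregator 1 and eth0 in aggregator 2.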
Here are the relevant lines from the syslog relating to the bonding interface:
russell@pm1 ~ $ sudo cat /var/log/syslog | grep -i bond | grep kernel | grep "Sep 26 12:06"
Sep 26 12:06:38 pm1 kernel: [ 6.069287] bonding: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Sep 26 12:06:38 pm1 kernel: [ 6.077144] bonding: bond0: Setting MII monitoring interval to 100.
Sep 26 12:06:38 pm1 kernel: [ 6.084404] bonding: bond0: setting mode to 802.3ad (4).
Sep 26 12:06:38 pm1 kernel: [ 6.086176] bonding: bond0: Setting LACP rate to fast (1).
Sep 26 12:06:38 pm1 kernel: [ 6.088046] ADDRCONF(NETDEV_UP): bond0: link is not ready
Sep 26 12:06:38 pm1 kernel: [ 6.213700] bonding: bond0: Adding slave eth1.
Sep 26 12:06:38 pm1 kernel: [ 6.296412] bonding: bond0: enslaving eth1 as a backup interface with a down link.
Sep 26 12:06:38 pm1 kernel: [ 7.083578] bonding: bond0: link status definitely up for interface eth1, 1000 Mbps full duplex.
Sep 26 12:06:38 pm1 kernel: [ 7.084460] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
Sep 26 12:06:38 pm1 kernel: [ 7.270717] bonding: bond0: Adding slave eth0.
Sep 26 12:06:38 pm1 kernel: [ 7.354304] bonding: bond0: enslaving eth0 as a backup interface with an up link.
Sep 26 12:06:38 pm1 kernel: [ 7.594951] bonding: bond0: Setting MII monitoring interval to 100.
Sep 26 12:06:38 pm1 kernel: [ 7.595780] bonding: unable to update mode of bond0 because interface is up.
Sep 26 12:06:38 pm1 kernel: [ 7.596696] bonding: bond0: Unable to update LACP rate because interface is up.
Sep 26 12:06:46 pm1 kernel: [ 17.418840] bond0: no IPv6 routers present
It appears that the ifenslave script is trying to modify the bond network device after it has been brought up, even though it has already brought it up correctly beforehand. Perhaps this is the reason for the failed detection? The relevant lines are:
Sep 26 12:06:38 pm1 kernel: [ 7.595780] bonding: unable to update mode of bond0 because interface is up.
Sep 26 12:06:38 pm1 kernel: [ 7.596696] bonding: bond0: Unable to update LACP rate because interface is up.
And in fact, you see these lines on boot-up just before the big wait
happens (please see attached screen shot taken using the Remote
Management Module at boot time).
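To make the ordering explicit, here is a rough Python sketch that pulls the kernel timestamps out of syslog lines and lists the rejected updates alongside the moment bond0 reported "link becomes ready". The names LINE_RE and find_rejected_updates are illustrative assumptions; the sample lines are copied from the syslog excerpt above, and the matching relies only on the message wording seen there.

```python
import re

# Sketch only: LINE_RE and find_rejected_updates are hypothetical names.
# SAMPLE is copied from the syslog excerpt quoted in this report.
SAMPLE = """\
Sep 26 12:06:38 pm1 kernel: [ 7.084460] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
Sep 26 12:06:38 pm1 kernel: [ 7.594951] bonding: bond0: Setting MII monitoring interval to 100.
Sep 26 12:06:38 pm1 kernel: [ 7.595780] bonding: unable to update mode of bond0 because interface is up.
Sep 26 12:06:38 pm1 kernel: [ 7.596696] bonding: bond0: Unable to update LACP rate because interface is up.
"""

# Captures the kernel uptime stamp in brackets and the message after it.
LINE_RE = re.compile(r"kernel: \[\s*(\d+\.\d+)\] (.*)$")


def find_rejected_updates(text):
    """Return (ready_time, [(time, message), ...]).

    ready_time is the first "link becomes ready" stamp (or None); the list
    holds every message saying an update failed "because interface is up".
    """
    ready = None
    rejected = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue
        stamp, message = float(match.group(1)), match.group(2)
        if "link becomes ready" in message:
            if ready is None:
                ready = stamp
        elif "because interface is up" in message.lower():
            rejected.append((stamp, message))
    return ready, rejected


if __name__ == "__main__":
    ready, rejected = find_rejected_updates(SAMPLE)
    print("link ready at:", ready)
    for stamp, message in rejected:
        print(f"{stamp:.6f} (+{stamp - ready:.3f}s after ready): {message}")
```

On the lines quoted above, both rejected updates carry timestamps roughly half a second later than the "link becomes ready" message, which is consistent with the bond being reconfigured after it was already up.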
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ifenslave-2.6/+bug/1056792/+subscriptions