netcfg fails on quad-nic

Spike spike at drba.org
Sat Mar 18 22:54:22 UTC 2017


Hi,

I'm not quite sure this is the right place to discuss this, but I've had no
luck on other IRC channels, groups and whatnot, and I couldn't find a better
place with server people. Happy to be redirected, and thank you for your
patience.

# Issue
On a supermicro server with a quadport nic, trying to preseed with
netcfg/choose_interface auto leads to a failure because netcfg selects the
wrong interface.

From other bugs and the code itself, it seems that netcfg should cycle over
all interfaces, trying to bring each one up and acquire an IP on it, and stop
at the first one that works. In my case, looking at the logs, I see no
trace of it trying different interfaces (log attached). On xenial with
systemd's predictable interface names, the first selected NIC is in reality
the 3rd, or what would have been eth2. That is not the LAN one that has
dhcp/pxe, and for testing I've even unplugged it completely. Nonetheless the
logs show netcfg bringing it up, running dhclient on it and failing. If I
drop into a shell I can correctly bring up the first interface, which has the
cable plugged in, acquire an IP and all, but when I go back to the d-i
screen and retry it still uses the 3rd interface and fails again. At this
point I'm fundamentally stuck, with no option to proceed with the
installation.
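
To make the expected behavior concrete, here is a minimal Python sketch of
"walk the interfaces and stop at the first one with link". It is not netcfg's
actual code and it will not run inside the d-i busybox environment; it only
illustrates the logic on a normal Linux system. The function name and the
alphabetical iteration order are my own assumptions. It relies on the sysfs
file /sys/class/net/<iface>/carrier, which reads "1" when link is detected.

```python
from pathlib import Path


def first_interface_with_link(sysfs_net="/sys/class/net", skip=("lo",)):
    """Return the name of the first interface reporting carrier, or None.

    Hypothetical helper for illustration only; netcfg's real selection
    order and link detection (ethtool-lite) may differ.
    """
    root = Path(sysfs_net)
    if not root.is_dir():
        return None
    # Alphabetical order is used here purely to make the result deterministic.
    for iface_dir in sorted(root.iterdir()):
        if iface_dir.name in skip:
            continue
        try:
            carrier = (iface_dir / "carrier").read_text().strip()
        except OSError:
            # The read fails while the interface is administratively down,
            # or when the entry is not an interface directory at all.
            continue
        if carrier == "1":
            return iface_dir.name
    return None


if __name__ == "__main__":
    print(first_interface_with_link())
```

With enp129s0f0 unplugged and enp4s0f0 cabled and brought up, a loop like this
would skip the former and return the latter, which is the outcome I expected
from choose_interface=auto.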

This is the syslog snippet showing the behavior:
Mar 18 21:14:58 netcfg[1402]: INFO: Taking down interface enp129s0f0
Mar 18 21:14:58 netcfg[1402]: INFO: Taking down interface enp129s0f3
Mar 18 21:14:58 netcfg[1402]: INFO: Taking down interface enp4s0f0
Mar 18 21:14:58 netcfg[1402]: INFO: Taking down interface enp4s0f1
Mar 18 21:14:58 netcfg[1402]: INFO: Taking down interface lo
Mar 18 21:14:58 netcfg[1402]: INFO: Activating interface enp129s0f0
Mar 18 21:14:58 netcfg[1402]: DEBUG: State is now 0
Mar 18 21:14:58 netcfg[1402]: DEBUG: Want link on enp129s0f0
Mar 18 21:14:58 kernel: [    7.351939] IPv6: ADDRCONF(NETDEV_UP): enp129s0f0: link is not ready
Mar 18 21:14:58 netcfg[1402]: INFO: Waiting time set to 3
Mar 18 21:14:58 netcfg[1402]: INFO: ethtool-lite: enp129s0f0: carrier down
Mar 18 21:14:59 netcfg[1402]: INFO: ethtool-lite: enp129s0f0: carrier down
Mar 18 21:14:59 netcfg[1402]: INFO: ethtool-lite: enp129s0f0: carrier down
Mar 18 21:14:59 netcfg[1402]: INFO: ethtool-lite: enp129s0f0: carrier down
Mar 18 21:14:59 netcfg[1402]: INFO: ethtool-lite: enp129s0f0: carrier down
Mar 18 21:15:00 netcfg[1402]: INFO: ethtool-lite: enp129s0f0: carrier down
Mar 18 21:15:00 netcfg[1402]: INFO: ethtool-lite: enp129s0f0: carrier down
Mar 18 21:15:00 netcfg[1402]: INFO: ethtool-lite: enp129s0f0: carrier down
Mar 18 21:15:00 netcfg[1402]: INFO: ethtool-lite: enp129s0f0: carrier down
Mar 18 21:15:01 netcfg[1402]: INFO: ethtool-lite: enp129s0f0: carrier down
Mar 18 21:15:01 netcfg[1402]: INFO: ethtool-lite: enp129s0f0: carrier down
Mar 18 21:15:01 netcfg[1402]: INFO: ethtool-lite: enp129s0f0: carrier down
Mar 18 21:15:01 netcfg[1402]: INFO: Reached timeout for link detection on enp129s0f0
Mar 18 21:15:01 netcfg[1402]: DEBUG: Commencing network autoconfiguration on enp129s0f0
...
...
Mar 18 21:15:04 netcfg[1402]: WARNING **: Started DHCP client; PID is 1475
Mar 18 21:15:05 dhclient[1475]: DHCPDISCOVER on enp129s0f0 to 255.255.255.255 port 67 interval 1 (xid=0x7056351a)
Mar 18 21:15:06 dhclient[1475]: DHCPDISCOVER on enp129s0f0 to 255.255.255.255 port 67 interval 1 (xid=0x7056351a)
...
...
Mar 18 21:37:29 dhclient[1481]: No DHCPOFFERS received.
Mar 18 21:37:29 dhclient[1481]: No working leases in persistent database - sleeping.
Mar 18 21:47:38 netcfg[1402]: DEBUG: State is now 3
Mar 18 21:47:42 netcfg[1402]: DEBUG: State is now 0
Mar 18 21:47:42 netcfg[1402]: DEBUG: Want link on enp129s0f0
Mar 18 21:47:42 netcfg[1402]: INFO: Waiting time set to 3
Mar 18 21:47:42 netcfg[1402]: INFO: ethtool-lite: enp129s0f0: carrier down
Mar 18 21:47:42 netcfg[1402]: INFO: ethtool-lite: enp129s0f0: carrier down
<AND ON AND ON>

The "Activating interface" line comes from netcfg's main function while it
iterates over all interfaces, and the "carrier down" lines are the
ethtool-lite attempts to see if the link is up. What's puzzling, and seems
wrong, is why, after saying that it indeed "Reached timeout for link
detection", it would go ahead and still try to autoconfigure that interface.
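
To check the "no other interface is ever tried" observation against a full
syslog, a small Python sketch like the following can list every interface
netcfg reports activating. The function name is made up for illustration, and
the sample text is just the relevant lines from the snippet above.

```python
import re

# Matches the netcfg log line seen in the snippet above.
ACTIVATE_RE = re.compile(r"netcfg\[\d+\]: INFO: Activating interface (\S+)")


def activated_interfaces(log_text):
    """Return the distinct interfaces netcfg activated, in first-seen order."""
    seen = []
    for match in ACTIVATE_RE.finditer(log_text):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen


sample = """\
Mar 18 21:14:58 netcfg[1402]: INFO: Taking down interface enp129s0f0
Mar 18 21:14:58 netcfg[1402]: INFO: Taking down interface enp129s0f3
Mar 18 21:14:58 netcfg[1402]: INFO: Taking down interface enp4s0f0
Mar 18 21:14:58 netcfg[1402]: INFO: Taking down interface enp4s0f1
Mar 18 21:14:58 netcfg[1402]: INFO: Taking down interface lo
Mar 18 21:14:58 netcfg[1402]: INFO: Activating interface enp129s0f0
"""

print(activated_interfaces(sample))  # ['enp129s0f0']
```

If netcfg were cycling through the NICs, I would expect more than one name in
that list when run over the complete log.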

These are the cards:
04:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
04:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
81:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
81:00.3 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)

The one at 04:00.0 is the "first" one, plugged in, from which I pxeboot, so
it works fine, sees dhcp and all; it just becomes a problem during install
with netcfg.

This is the place in syslog where the interfaces get their predictable names:
Mar 18 21:14:54 kernel: [    2.768606] igb 0000:81:00.0 enp129s0f0: renamed from eth2
Mar 18 21:14:54 kernel: [    2.816294] igb 0000:04:00.0 enp4s0f0: renamed from eth0
Mar 18 21:14:54 kernel: [    2.840224] igb 0000:81:00.3 enp129s0f3: renamed from eth3
Mar 18 21:14:54 kernel: [    2.876220] igb 0000:04:00.1 enp4s0f1: renamed from eth1

Notice again that enp4s0f0 is the "first card" and named eth0.
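
For anyone wanting to pull the same mapping out of their own syslog, here is a
small Python sketch that parses the kernel "renamed from" lines into an
old-name to new-name table. The function name is hypothetical; the regex is
written against the igb lines above and uses \s+ before "from" so it also
copes with lines that a mail client has wrapped.

```python
import re

# PCI address, new (predictable) name, old (ethN) name.
RENAME_RE = re.compile(
    r"([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) (\S+): renamed\s+from (\S+)"
)


def parse_renames(log_text):
    """Map each old kernel name to (new name, PCI address)."""
    mapping = {}
    for pci_addr, new_name, old_name in RENAME_RE.findall(log_text):
        mapping[old_name] = (new_name, pci_addr)
    return mapping


sample = """\
Mar 18 21:14:54 kernel: [    2.768606] igb 0000:81:00.0 enp129s0f0: renamed
from eth2
Mar 18 21:14:54 kernel: [    2.816294] igb 0000:04:00.0 enp4s0f0: renamed
from eth0
Mar 18 21:14:54 kernel: [    2.840224] igb 0000:81:00.3 enp129s0f3: renamed
from eth3
Mar 18 21:14:54 kernel: [    2.876220] igb 0000:04:00.1 enp4s0f1: renamed
from eth1
"""

for old_name, (new_name, pci_addr) in sorted(parse_renames(sample).items()):
    print(f"{old_name} -> {new_name} ({pci_addr})")
```

Sorted by old name, this puts enp4s0f0 (eth0) first and enp129s0f0 (eth2)
third, which is the ordering I was expecting netcfg to respect.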

Further evidence that this is a problem with retries in netcfg seems to be
that if I disable predictable names, so that enp4s0f0 is eth0 (the one
connected), this just works and the install proceeds as expected.

This is the boot command line from pxelinux:

        APPEND noprompt console nosplash auto=true priority=critical netcfg/choose_interface=auto keyboard-configuration/layoutcode=us console-setup/ask_detect=false url=http://pxe-srv/preseed/srv1604.seed vga=788 netboot=nfs nfsroot=pxe-srv:/srv/isos/ubuntu/srv1604 initrd=ubuntu/srv1604/initrd.gz

Thank you for any input or other pointers.

best,

Spike