[Bug 1598299] Re: Ubuntu14.04.05 netboot installation fails with timeout errors due to ignoring ARP update requests
bugproxy
bugproxy at us.ibm.com
Fri Sep 2 17:30:51 UTC 2016
------- Comment From ruddk at us.ibm.com 2016-09-02 13:26 EDT-------
(In reply to comment #20)
> Unless this is a regression in 14.04.5 vs. previous point releases, since
> 14.04.5 is the last point release of 14.04 this would not typically be a
> priority to address in SRU but rather we would encourage use of 16.04 for
> this case.
Fair enough. I'll go ahead and mark the bug as
Alternate_Solution_Available on the IBM side considering that the number
of new netboot installs of 14.04 should be minimal, and the bootloader
code from a 16.04 netboot image can always be used.
** Tags removed: targetmilestone-inin14045
** Tags added: targetmilestone-inin1604
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to debian-installer in Ubuntu.
https://bugs.launchpad.net/bugs/1598299
Title:
Ubuntu14.04.05 netboot installation fails with timeout errors due to
ignoring ARP update requests
Status in debian-installer package in Ubuntu:
New
Bug description:
== Comment: #0 - Manvanthara B. Puttashankar <mputtash at in.ibm.com> - 2016-06-30 07:26:13 ==
---Problem Description---
Ubuntu14.04.05 netboot installation fails with Baby Blue tip (Mellanox).
This issue looks similar to
https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1428005, reported
on 15.04.
Netboot server configuration:
The Ubuntu packages were picked from:
http://ports.ubuntu.com/ubuntu-ports/dists/trusty-proposed/main/installer-ppc64el/current/images/xenial-netboot/ubuntu-installer/ppc64el/
linux-m0th:~ # cat /etc/dhcpd.conf
allow bootp;
allow booting;
max-lease-time 420;
default-lease-time 120;
ddns-update-style none;
always-reply-rfc1048 true;
ignore unknown-clients;
option conf-file code 209 = text;
log-facility local7;
subnet 9.47.64.0 netmask 255.255.240.0 {
    allow bootp;
    next-server 9.47.68.41;
    option routers 9.47.79.254;
    group {
        host ltcalpine-lp7.pok.stglabs.ibm.com {
            hardware ethernet F4:52:14:6C:16:C0;
            #hardware ethernet ea:e3:86:8d:2f:02;
            fixed-address 9.47.67.114;
            option host-name "ltcalpine-lp7.pok.stglabs.ibm.com";
            option tftp-server-name "9.47.68.41";
            filename "ubuntu-installer/ppc64el/powerpc-ieee1275/core.elf";
        }
    }
}
linux-m0th:~ # cat /etc/xinetd.d/tftp
# default: off
# description: tftp service is provided primarily for booting or when a \
# router need an upgrade. Most sites run this only on machines acting as \
# "boot servers".
# The tftp protocol is often used to boot diskless \
# workstations, download configuration files to network-aware printers, \
# and to start the installation process for some operating systems.
service tftp
{
    socket_type = dgram
    protocol = udp
    wait = yes
    flags = IPv6 IPv4
    user = root
    server = /usr/sbin/in.tftpd
    server_args = -u tftp -s /srv/tftpboot
    # per_source = 11
    # cps = 100 2
    disable = no
}
linux-m0th:~ # cat /srv/tftpboot/ubuntu-installer/ppc64el/grub.cfg
set timeout=-1
menuentry "Install" {
    linux ubuntu-installer/ppc64el/vmlinux tasks=standard pkgsel/language-pack-patterns= pkgsel/install-language-support=false --- quiet
    initrd ubuntu-installer/ppc64el/initrd.gz
}
menuentry "Rescue mode" {
    linux ubuntu-installer/ppc64el/vmlinux rescue/enable=true --- quiet
    initrd ubuntu-installer/ppc64el/initrd.gz
}
client:
BOOTP Parameters:
----------------
server IP = 9.47.68.41
client IP = 9.47.67.114
gateway IP = 9.47.79.254
device = /pci@800000020000040/pci15b3,1007@0/ethernet@0
MAC address = f4 52 14 6c 16 c0
loc-code = U78C7.001.RCH0040-P1-C1-T1
BOOTP request retry attempt: 1
BOOTP request retry attempt: 2
BOOTP request retry attempt: 3
TFTP BOOT ---------------------------------------------------
Server IP.....................9.47.68.41
Client IP.....................9.47.67.114
Gateway IP....................9.47.79.254
Subnet Mask...................255.255.240.0
( 1 ) Filename.................ubuntu-installer/ppc64el/powerpc-ieee1275/core.elf
TFTP Retries..................5
Block Size....................512
FINAL PACKET COUNT = 302
FINAL FILE SIZE = 154456 BYTES
Elapsed time since release of system processors: 80 mins 56 secs
GNU GRUB version 2.02~beta2-9ubuntu1.8
+----------------------------------------------------------------------------+
|*Install |
| Rescue mode |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
+----------------------------------------------------------------------------+
Use the ^ and v keys to select which entry is highlighted.
Press enter to boot the selected OS, `e' to edit the commands
before booting or `c' for a command-line.
error: timeout reading `ubuntu-installer/ppc64el/vmlinux'.
error: you need to load the kernel first.
Press any key to continue...
GNU GRUB version 2.02~beta2-9ubuntu1.8
+----------------------------------------------------------------------------+
|setparams 'Install' |
| |
| linux ubuntu-installer/ppc64el/vmlinux tasks=standard pkgsel\|
|/language-pack-patterns= pkgsel/install-language-support=false --- quiet |
| initrd ubuntu-installer/ppc64el/initrd.gz |
| |
| |
| |
| |
| |
| |
| |
+----------------------------------------------------------------------------+
Minimum Emacs-like screen editing is supported. TAB lists
completions. Press Ctrl-x or F10 to boot, Ctrl-c or F2 for
a command-line or ESC to discard edits and return to the GRUB menu.
error: timeout reading `ubuntu-installer/ppc64el/vmlinux'.
error: you need to load the kernel first.
Press any key to continue...
---uname output---
4.4.0-28-generic
Machine Type = s822l
---boot type---
Network boot
---bootloader---
grub
---Kernel cmdline used to launch install---
set timeout=-1
menuentry
---Bootloader protocol---
tftp
---Install repository type---
Internet repository
---Install repository Location---
ports.ubuntu.com
---Point of failure---
Other failure during installation (stage 1)
== Comment: #4 - Kevin W. Rudd - 2016-06-30 18:56:02 ==
I was able to gather some network traces during one of these failed installs.
The lpar stops responding to ARP requests. This appears to be the
real killer here. The boot process proceeds to the point of getting
the grub.cfg file, but the remote server's ARP entry eventually times
out, and the connection stalls:
...
9829 422.498081 9.47.68.41 -> 9.47.67.114 TFTP 1070 Data Packet, Block: 4295
9830 422.498456 9.47.67.114 -> 9.47.68.41 TFTP 60 Acknowledgement, Block: 4295
9831 422.498470 9.47.68.41 -> 9.47.67.114 TFTP 1070 Data Packet, Block: 4296
9832 422.498853 9.47.67.114 -> 9.47.68.41 TFTP 60 Acknowledgement, Block: 4296
9833 422.498873 e4:1d:2d:10:92:40 -> ff:ff:ff:ff:ff:ff ARP 42 Who has 9.47.67.114? Tell 9.47.68.41
9834 423.498762 e4:1d:2d:10:92:40 -> ff:ff:ff:ff:ff:ff ARP 42 Who has 9.47.67.114? Tell 9.47.68.41
9835 424.498778 e4:1d:2d:10:92:40 -> ff:ff:ff:ff:ff:ff ARP 42 Who has 9.47.67.114? Tell 9.47.68.41
...
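A trace like the one above can be scanned for this pattern automatically. The following is a minimal sketch, assuming the capture was exported as tshark one-line summaries in the same layout as the excerpt; the function name unanswered_arp and the regular expressions are illustrative and not part of any tool used in this bug.

```python
import re

# tshark one-line summary of an ARP request, for example:
#   9833 422.498873 e4:1d:2d:10:92:40 -> ff:ff:ff:ff:ff:ff ARP 42 Who has 9.47.67.114? Tell 9.47.68.41
WHO_HAS = re.compile(
    r"^\s*\d+\s+([\d.]+)\s+\S+\s+->\s+\S+\s+ARP\s+\d+\s+"
    r"Who has ([\d.]+)\?\s+Tell ([\d.]+)"
)
# tshark one-line summary of an ARP reply ("<ip> is at <mac>").
IS_AT = re.compile(r"ARP\s+\d+\s+([\d.]+) is at ")


def unanswered_arp(lines):
    """Return {target_ip: [request timestamps]} for ARP requests that
    were never followed by a reply for the same IP in the capture."""
    pending = {}
    for line in lines:
        request = WHO_HAS.search(line)
        if request:
            timestamp, target = float(request.group(1)), request.group(2)
            pending.setdefault(target, []).append(timestamp)
            continue
        reply = IS_AT.search(line)
        if reply:
            # A reply clears every outstanding request for that IP.
            pending.pop(reply.group(1), None)
    return pending


# Sample lines copied from the trace in this comment.
trace = [
    "9832 422.498853 9.47.67.114 -> 9.47.68.41 TFTP 60 Acknowledgement, Block: 4296",
    "9833 422.498873 e4:1d:2d:10:92:40 -> ff:ff:ff:ff:ff:ff ARP 42 Who has 9.47.67.114? Tell 9.47.68.41",
    "9834 423.498762 e4:1d:2d:10:92:40 -> ff:ff:ff:ff:ff:ff ARP 42 Who has 9.47.67.114? Tell 9.47.68.41",
    "9835 424.498778 e4:1d:2d:10:92:40 -> ff:ff:ff:ff:ff:ff ARP 42 Who has 9.47.67.114? Tell 9.47.68.41",
]
stalled = unanswered_arp(trace)
for ip, times in stalled.items():
    print(f"{ip}: {len(times)} unanswered ARP request(s), first at {times[0]}")
```

For the excerpt above this reports three unanswered requests for 9.47.67.114, the client address, which matches the point where the TFTP transfer stops.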
As a test, I fixed the ARP entry on the tftp/NFS server, and was able
to boot into the installer. The lpar is currently sitting in the
installer waiting for further instructions.
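The comment does not say how the ARP entry was fixed on the server. One common way to pin an entry on a Linux host is a permanent neighbour entry via iproute2. The sketch below only builds and prints such a command; the IP and MAC address are taken from the BOOTP parameters above, while the interface name eth0 and the helper name static_arp_command are assumptions for illustration.

```python
def static_arp_command(ip, mac, dev):
    """Build an iproute2 command that pins a permanent ARP (neighbour)
    entry for `ip` on interface `dev`. The command is returned as an
    argument list; nothing is executed here."""
    return ["ip", "neigh", "replace", ip,
            "lladdr", mac, "dev", dev, "nud", "permanent"]


# Client IP and MAC from the BOOTP parameters; "eth0" is a placeholder
# for whichever server interface faces the client subnet.
cmd = static_arp_command("9.47.67.114", "f4:52:14:6c:16:c0", "eth0")
print(" ".join(cmd))
```

The printed command would need to be run as root on the tftp server. It is a workaround for testing only, since it hides the fact that the client no longer answers ARP.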
== Comment: #11 - Kevin W. Rudd - 2016-07-01 16:51:31 ==
This issue does seem to be specific to the grub code found in the trusty-xenial.318.39 netboot image:
http://ports.ubuntu.com/ubuntu-ports/dists/trusty-proposed/main/installer-ppc64el/20101020ubuntu318.39/images/xenial-netboot/
Since it was reported that a 16.04.01 install worked on this lpar, I
created a hybrid ubuntu-installer directory where the
ubuntu-installer/ppc64el/powerpc-ieee1275 directory pointed to images
pulled from the following xenial image:
http://ports.ubuntu.com/ubuntu-ports/dists/xenial-proposed/main/installer-ppc64el/20101020ubuntu451.2/images/netboot/
This hybrid netboot structure worked just fine, and grub properly
responded to ARP requests.
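The hybrid layout described above could be set up along the following lines. This is a sketch under assumptions: the comment does not say whether a symlink or a copy was used, so a relative symlink is shown here, and the helper name point_grub_dir_at, the ".orig" backup name, and the xenial-netboot directory are hypothetical. The demo runs in a temporary directory, not in /srv/tftpboot.

```python
import os
import tempfile
from pathlib import Path


def point_grub_dir_at(arch_dir, replacement_dir, name="powerpc-ieee1275"):
    """Make <arch_dir>/<name> a relative symlink to replacement_dir.
    An existing real directory is kept alongside as <name>.orig."""
    arch_dir = Path(arch_dir)
    target = arch_dir / name
    backup = arch_dir / (name + ".orig")
    if target.is_symlink():
        target.unlink()
    elif target.exists():
        if backup.exists():
            raise FileExistsError(f"backup already present: {backup}")
        target.rename(backup)
    # A relative link keeps working when the tree is served chrooted.
    relative = os.path.relpath(replacement_dir, arch_dir)
    target.symlink_to(relative, target_is_directory=True)
    return target


# Demo in a scratch directory standing in for the TFTP root.
with tempfile.TemporaryDirectory() as scratch:
    root = Path(scratch)
    trusty_arch = root / "ubuntu-installer" / "ppc64el"
    xenial_grub = root / "xenial-netboot" / "powerpc-ieee1275"
    (trusty_arch / "powerpc-ieee1275").mkdir(parents=True)
    xenial_grub.mkdir(parents=True)
    (xenial_grub / "core.elf").write_text("placeholder")

    link = point_grub_dir_at(trusty_arch, xenial_grub)
    link_target = os.readlink(link)
    grub_image_visible = (link / "core.elf").exists()
    original_kept = (trusty_arch / "powerpc-ieee1275.orig").is_dir()
    print(link_target)
```

Because the tftp service above is started with -s /srv/tftpboot, the link target has to live under that same root. Copying the xenial powerpc-ieee1275 directory into place would avoid the question entirely.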
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/debian-installer/+bug/1598299/+subscriptions