[Bug 1437353] Re: UEFI network boot hangs at grub for adapter 82599ES 10-Gigabit SFI/SFP+

Mathieu Trudel-Lapierre mathieu.tl at gmail.com
Wed May 16 14:04:04 UTC 2018


Well, the CI part confirms that there is no regression, but there is as
yet no indication that the issue is fixed, aside from the cases where
firmware was updated (and in those cases the fix did not come from the
SRU).

There's still a need to verify the fix positively on affected hardware.

Now, KingJ's comment says that testing has been done with the version of
grub in bionic, and with the version of grub being SRUed to xenial, and
neither worked. I think that qualifies as verification-failed, and
we'll need to have another look at grub's behavior. At this point, there
is no obvious patch missing, and we'll need to debug using packet
captures to try and make sense of it.
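
As a starting point for that packet-capture debugging, here is a minimal
Python sketch. It decodes TFTP headers (opcodes and 16-bit block numbers
as defined in RFC 1350, OACK from RFC 2347) and flags DATA or ACK blocks
that show up more often than a threshold. The function names, the
threshold value, and the assumption that raw UDP payloads have already
been extracted from the capture are illustrative only; none of this is
part of grub or MAAS.

```python
import struct
from collections import Counter

# TFTP opcodes from RFC 1350 (OACK from RFC 2347).
OPCODES = {1: "RRQ", 2: "WRQ", 3: "DATA", 4: "ACK", 5: "ERROR", 6: "OACK"}


def decode_tftp(payload):
    """Return (opcode_name, block_number or None) for one UDP payload."""
    if len(payload) < 2:
        raise ValueError("payload too short for a TFTP header")
    (opcode,) = struct.unpack("!H", payload[:2])
    name = OPCODES.get(opcode, "UNKNOWN")
    # Only DATA and ACK carry a 16-bit block number after the opcode.
    if name in ("DATA", "ACK") and len(payload) >= 4:
        (block,) = struct.unpack("!H", payload[2:4])
        return name, block
    return name, None


def find_resend_loops(payloads, threshold=5):
    """Return {(opcode_name, block): count} for packets seen more than
    `threshold` times (hypothetical cut-off, tune per capture)."""
    counts = Counter(decode_tftp(p) for p in payloads)
    return {key: n for key, n in counts.items()
            if key[1] is not None and n > threshold}
```

Note that block numbers are 16 bits wide, so on transfers long enough to
wrap the counter a repeated block number is not necessarily a re-send;
the timestamps in the capture still need to be checked by hand.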

** Tags removed: verification-done verification-done-xenial
** Tags added: verification-failed-xenial

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to grub2 in Ubuntu.
https://bugs.launchpad.net/bugs/1437353

Title:
  UEFI network boot hangs at grub for adapter 82599ES 10-Gigabit
  SFI/SFP+

Status in MAAS:
  Invalid
Status in maas-images:
  Triaged
Status in python-tx-tftp:
  Invalid
Status in grub2 package in Ubuntu:
  Fix Released
Status in grub2-signed package in Ubuntu:
  New
Status in grub2 source package in Trusty:
  New
Status in grub2-signed source package in Trusty:
  New
Status in grub2 source package in Xenial:
  Fix Committed
Status in grub2-signed source package in Xenial:
  Fix Committed
Status in grub2 source package in Yakkety:
  Won't Fix
Status in grub2-signed source package in Yakkety:
  Won't Fix

Bug description:
  [Impact]
  MAAS commissioning may fail when deploying Xenial images or using grubx64.efi from Xenial due to hardware particularities of some Intel 82599-based network cards. Network cards from other manufacturers may be affected as well. The main failure mode appears to be an infinite re-send of some packets because of an unexpected response from the network hardware.

  [Test case]
  1) Attempt to netboot on a system with a "82599ES 10-Gigabit SFI/SFP+" network adapter in UEFI mode.
  2) Validate that netbooting happens correctly, passing control over to the kernel as configured in grub.cfg.

  3) Validate that netbooting another system, not using an Intel 82599
  adapter, behaves normally when booting in UEFI mode.

  4) Validate that netbooting another system, not using an Intel 82599
  adapter, behaves normally when booting in LEGACY mode.

  [Regression potential]
  As this affects networking in EFI mode, any failure to netboot using EFI should be considered a possible regression. Systems may fail to receive data from the network boot server and terminate the process with a timeout. Another possible failure scenario is incomplete or corrupted data being received over the network.
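
  One rough way to look for the incomplete-data or corruption scenarios is to compare a file as served by the boot server with a copy fetched over the network. The Python sketch below assumes both copies are available as local files; how the fetched copy is obtained is left open, and the function names are made up for illustration.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large boot images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(served_path, fetched_path):
    """True when the served file and the fetched copy are byte-identical."""
    return sha256_of(served_path) == sha256_of(fetched_path)
```

  A truncated transfer and a corrupted one both show up as a mismatch here; comparing file sizes as well would tell the two apart.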

  ----

  I am using MAAS to commission and install machines. When I attempt to commission a machine with a "82599ES 10-Gigabit SFI/SFP+" network adapter the following happens:
  1) TFTP Request — bootx64.efi
  2) TFTP Request — /grubx64.efi
  3) Console hangs at grub prompt

  If I go into the BIOS and force the adapter above into legacy mode then the machine is able to network boot and run through the commission process.
  1) TFTP Request — ubuntu/amd64/generic/trusty/release/boot-initrd
  2) TFTP Request — ubuntu/amd64/generic/trusty/release/boot-kernel
  3) TFTP Request — ifcpu64.c32
  4) PXE Request — power off
  5) TFTP Request — pxelinux.cfg/01-90-e2-ba-52-23-78
  6) TFTP Request — pxelinux.cfg/71e3f102-bd8b-11e4-b634-3c18a001c80a
  7) TFTP Request — pxelinux.0
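
  The pxelinux.cfg/01-90-e2-ba-52-23-78 request in the list above follows the PXELINUX convention of looking for a per-machine configuration file named after the ARP hardware type (01 for Ethernet) followed by the MAC address in lowercase hex, separated by dashes. A small Python sketch of that naming, with a function name made up for illustration:

```python
HEX_DIGITS = set("0123456789abcdef")


def pxelinux_config_name(mac):
    """Build the per-MAC PXELINUX config path for an Ethernet address."""
    octets = mac.lower().replace("-", ":").split(":")
    if len(octets) != 6 or any(
        len(o) != 2 or not set(o) <= HEX_DIGITS for o in octets
    ):
        raise ValueError("expected a 6-octet MAC address")
    # "01" is the ARP hardware type for Ethernet.
    return "pxelinux.cfg/01-" + "-".join(octets)
```

  Calling pxelinux_config_name("90:E2:BA:52:23:78") gives the path seen in the request log.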

  Also, if I disconnect the cable to the adapter above and connect a
  cable to the integrated "I210 Gigabit" adapter, which is configured for
  UEFI mode, the machine is able to network boot grubx64.efi and run
  through the commission process.

  ~$ dpkg -l '*maas*'|cat
  Desired=Unknown/Install/Remove/Purge/Hold
  | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
  |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
  ||/ Name                                  Version                            Architecture Description
  +++-=====================================-==================================-============-===============================================================================
  ii  maas                                  1.7.2+bzr3355-0ubuntu1~trusty1     all          MAAS server all-in-one metapackage
  ii  maas-cli                              1.7.2+bzr3355-0ubuntu1~trusty1     all          MAAS command line API tool
  ii  maas-cluster-controller               1.7.2+bzr3355-0ubuntu1~trusty1     all          MAAS server cluster controller
  ii  maas-common                           1.7.2+bzr3355-0ubuntu1~trusty1     all          MAAS server common files
  ii  maas-dhcp                             1.7.2+bzr3355-0ubuntu1~trusty1     all          MAAS DHCP server
  ii  maas-dns                              1.7.2+bzr3355-0ubuntu1~trusty1     all          MAAS DNS server
  ii  maas-proxy                            1.7.2+bzr3355-0ubuntu1~trusty1     all          MAAS Caching Proxy
  ii  maas-region-controller                1.7.2+bzr3355-0ubuntu1~trusty1     all          MAAS server complete region controller
  ii  maas-region-controller-min            1.7.2+bzr3355-0ubuntu1~trusty1     all          MAAS Server minimum region controller
  ii  python-django-maas                    1.7.2+bzr3355-0ubuntu1~trusty1     all          MAAS server Django web framework
  ii  python-maas-client                    1.7.2+bzr3355-0ubuntu1~trusty1     all          MAAS python API client
  ii  python-maas-provisioningserver        1.7.2+bzr3355-0ubuntu1~trusty1     all          MAAS server provisioning libraries
  ~$

To manage notifications about this bug go to:
https://bugs.launchpad.net/maas/+bug/1437353/+subscriptions


