[Bug 1958389]
Alan Modra
1958389 at bugs.launchpad.net
Wed Feb 9 11:11:12 UTC 2022
Fixed on mainline and the 2.38 branch.
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to binutils in Ubuntu.
https://bugs.launchpad.net/bugs/1958389
Title:
Jammy builds of xen segfault, but only on launchpad x86 builders
Status in binutils:
Fix Released
Status in launchpad-buildd:
New
Status in binutils package in Ubuntu:
Fix Released
Status in xen package in Ubuntu:
Invalid
Status in binutils source package in Jammy:
Fix Released
Status in xen source package in Jammy:
Invalid
Status in binutils package in Debian:
Fix Released
Bug description:
FTBFS in Jammy on LP infra:
https://launchpadlibrarian.net/580924961/buildlog_ubuntu-jammy-amd64.xen_4.16.0-1~ubuntu1~jammyppa4_BUILDING.txt.gz
https://launchpadlibrarian.net/581060687/buildlog_ubuntu-jammy-amd64.xen_4.16.0-1~ubuntu1~jammyppa6_BUILDING.txt.gz
Related PPA:
https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/4760/+packages
Summary:
- Build reliably fails on LP
- Build in local sbuild works reliably on my laptop
- Build in local VM (sized like the LP builders) works (it hits other crashes, but the build succeeds)
- Build on AMD server (chip more similar to LP) works reliably
Failing step:
On Launchpad build infrastructure it breaks on ld:
$ x86_64-linux-gnu-ld -mi386pep --subsystem=10 --image-base=0xffff82d040000000 --stack=0,0 --heap=0,0 --section-alignment=0x200000 --file-alignment=0x20 --major-image-version=4 --minor-image-version=16 --major-os-version=2 --minor-os-version=0 --major-subsystem-version=2 --minor-subsystem-version=0 --no-insert-timestamp --build-id=sha1 -T efi.lds -N prelink.o /<<PKGBUILDDIR>>/xen/common/symbols-dummy.o -b pe-x86-64 efi/buildid.o -o /<<PKGBUILDDIR>>/xen/.xen.efi.0xffff82d040000000.0 && :
Segmentation fault (core dumped)
---
Steps to recreate (result depends on platform)
# you can grab the package from https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/4760/+packages
sudo vim /etc/apt/sources.list
sudo apt update
sudo apt dist-upgrade -y
sudo apt build-dep xen
sudo apt install flex bison python3-dev libpython3-dev dpkg-dev devscripts apport-retrace
sudo mkdir /mnt/build
sudo chmod go+w /mnt/build
cd /mnt/build
# copy in things from host
scp xen_4.16.0-1~ubuntu1~jammyppa6.dsc xen_4.16.0-1~ubuntu1~jammyppa6.debian.tar.xz xen_4.16.0.orig.tar.bz2 ubuntu@<TODO>:/mnt/build
dpkg-source -x xen_4.16.0-1~ubuntu1~jammyppa6.dsc xen_4.16.0
cd xen_4.16.0
dpkg-buildpackage -i -us -uc -b
---
In a jammy VM 4cpu/8G I get some avx2 crashes but the build works:
Jan 19 07:41:27 j kernel: x86_64-linux-gn[130016]: segfault at 0 ip 00007f189432ef3d sp 00007ffc8e2361d8 error 4 in libc.so.6[7f18941bb000+194000]
Jan 19 07:41:27 j kernel: Code: f8 77 c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 89 f8 48 89 fa c5 f9 ef c0 25 ff 0f 00 00 3d e0 0f 00 00 0f 87 33 01 00 00 <c5> fd 74 0f c5 fd d7 c1 85 c0 74 57 f3 0f bc c0 c5 f8 77 c3 66 66
#0 __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74
74 ../sysdeps/x86_64/multiarch/strlen-avx2.S: No such file or directory.
(gdb) bt
#0 __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74
#1 0x00007fa98d63c2d0 in ?? () from /lib/x86_64-linux-gnu/libbfd-2.37.50-system.20220106.so
#2 0x00007fa98d6021e8 in ?? () from /lib/x86_64-linux-gnu/libbfd-2.37.50-system.20220106.so
#3 0x00007fa98d602509 in coff_write_alien_symbol () from /lib/x86_64-linux-gnu/libbfd-2.37.50-system.20220106.so
#4 0x00007fa98d6033bd in _bfd_coff_final_link () from /lib/x86_64-linux-gnu/libbfd-2.37.50-system.20220106.so
#5 0x0000562bdaaae3bf in ?? ()
#6 0x00007fa98d2e8fd0 in __libc_start_call_main (main=main@entry=0x562bdaaad5e0, argc=argc@entry=8, argv=argv@entry=0x7ffc797f2968) at ../sysdeps/nptl/libc_start_call_main.h:58
#7 0x00007fa98d2e907d in __libc_start_main_impl (main=0x562bdaaad5e0, argc=8, argv=0x7ffc797f2968, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
stack_end=0x7ffc797f2958) at ../csu/libc-start.c:409
#8 0x0000562bdaaad515 in ?? ()
^^ that is a different crash than the one on the LP builders
! And despite those crashes the build does appear to work oO?!
I see the same crashes in my local sbuild runs; the full set from one build is:
Jan 19 07:39:02 Keschdeichel kernel: x86_64-linux-gn[4131180]: segfault at 0 ip 00007f566e8b3f3d sp 00007ffde04b75a8 error 4 in libc.so.6[7f566e740000+194000]
Jan 19 07:39:03 Keschdeichel kernel: x86_64-linux-gn[4131332]: segfault at 0 ip 00007fbba26e4f3d sp 00007fffab8a5b68 error 4 in libc.so.6[7fbba2571000+194000]
Jan 19 07:39:03 Keschdeichel kernel: x86_64-linux-gn[4131382]: segfault at 0 ip 00007fe3681b7f3d sp 00007ffcbbf16628 error 4 in libc.so.6[7fe368044000+194000]
Jan 19 07:39:42 Keschdeichel kernel: x86_64-linux-gn[4134584]: segfault at 0 ip 00007f241f455f3d sp 00007ffd05c2e7c8 error 4 in libc.so.6[7f241f2e2000+194000]
Jan 19 07:44:57 Keschdeichel kernel: x86_64-linux-gn[4171794]: segfault at 0 ip 00007fcbe1f2bf3d sp 00007fff62005aa8 error 4 in libc.so.6[7fcbe1db8000+194000]
Jan 19 07:44:57 Keschdeichel kernel: x86_64-linux-gn[4172028]: segfault at 0 ip 00007f601dfa3f3d sp 00007ffe67ca2788 error 4 in libc.so.6[7f601de30000+194000]
Jan 19 07:44:58 Keschdeichel kernel: x86_64-linux-gn[4172154]: segfault at 0 ip 00007f1bfabb7f3d sp 00007ffe5ce9dfb8 error 4 in libc.so.6[7f1bfaa44000+194000]
Jan 19 07:45:05 Keschdeichel kernel: x86_64-linux-gn[4174536]: segfault at 0 ip 00007f0f48986f3d sp 00007ffc9e72ea48 error 4 in libc.so.6[7f0f48813000+194000]
I checked: these crashes do not happen in the configure stage, where
such things are sometimes intentional.
Running in a local VM with reduced CPU features (e.g. no avx2) still triggers
the same bfd issues, but the build still succeeds.
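
To check whether the kernel segfault lines quoted above all point at the same instruction, one can subtract the mapping start (the first number in the brackets) from the `ip` value. The following is only a sketch: the function name `parse_segfault` and the returned field names are made up for illustration, and the regular expression assumes the x86 log format exactly as it appears in this report. The error value is decoded using the x86 page-fault error code bits (bit 0: page present, bit 1: write access, bit 2: user mode).

```python
import re

# Matches the x86 kernel segfault report format as quoted in this bug.
SEGV_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return a dict describing one segfault log line, or None if it does not match."""
    m = SEGV_RE.search(line)
    if m is None:
        return None
    err = int(m["err"], 16)  # the kernel prints the error code in hex
    return {
        "fault_addr": int(m["addr"], 16),
        "object": m["obj"],
        # Offset of the faulting instruction inside the mapping shown in brackets.
        # This is relative to that mapping, not necessarily to the start of the file.
        "offset": int(m["ip"], 16) - int(m["base"], 16),
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


# Two lines copied from this report: one from the VM, one from the sbuild host.
lines = [
    "Jan 19 07:41:27 j kernel: x86_64-linux-gn[130016]: segfault at 0 ip 00007f189432ef3d "
    "sp 00007ffc8e2361d8 error 4 in libc.so.6[7f18941bb000+194000]",
    "Jan 19 07:39:02 Keschdeichel kernel: x86_64-linux-gn[4131180]: segfault at 0 ip 00007f566e8b3f3d "
    "sp 00007ffde04b75a8 error 4 in libc.so.6[7f566e740000+194000]",
]

results = [parse_segfault(line) for line in lines]
for r in results:
    print(r["object"], hex(r["offset"]), r)
```

Both sample lines give the same offset into the libc mapping, and "error 4" at address 0 decodes as a user-mode read of an unmapped page. That is consistent with the NULL pointer read in `__strlen_avx2` shown in the gdb backtrace, although it does not by itself identify the caller.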
---
The LP run is on a Rome chip, from the build env:
Model name: AMD EPYC-Rome Processor
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl xtopology cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr arat npt nrip_save umip rdpid
So I thought I might need to re-create the build on such a chip to check
if it fails there too.
Running on Riccioli, a machine from the kernel team with similar HW (AMD EPYC 7713),
works fine (like my local build does, this time without any crashes).
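
When looking for a machine that matches the builder, it can help to diff the CPU flag lists rather than compare them by eye. The sketch below assumes the flags are available as whitespace-separated strings, as in the "Flags:" line above. The helper name `flag_diff` is hypothetical, `builder_flags` is only a short excerpt of the builder's flag list, and `candidate_flags` is invented purely to show the output shape; it is not taken from any real host.

```python
def flag_diff(reference, candidate):
    """Return (missing, extra): flags only in reference, and flags only in candidate."""
    ref = set(reference.split())
    cand = set(candidate.split())
    return sorted(ref - cand), sorted(cand - ref)


# Excerpt of the LP builder flags quoted above (not the full list).
builder_flags = "sse4_1 sse4_2 avx f16c rdrand hypervisor avx2 bmi1 bmi2 sha_ni clzero"

# Hypothetical candidate host, made up for illustration only.
candidate_flags = "sse4_1 sse4_2 avx f16c rdrand avx2 bmi1 bmi2 sha_ni clzero vaes"

missing, extra = flag_diff(builder_flags, candidate_flags)
print("only on builder:  ", missing)
print("only on candidate:", extra)
```

An empty difference would not prove the hosts behave identically, since microcode, kernel, and hypervisor settings can also differ.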
---
I do not know how to continue: the attempts to reproduce on my laptop, in VM
guests, and on AMD servers similar to the build farm all build the package.
But on Launchpad it crashes with the reported error.
Is it the toolchain that needs a fix, is it the Launchpad builder setup, or both?
I do not know ... :-/
Filing this against xen+binutils+launchpad-buildd
To manage notifications about this bug go to:
https://bugs.launchpad.net/binutils/+bug/1958389/+subscriptions