Tip regression - bus error (bisected)

Jeffrey Hugo jhugo at codeaurora.org
Fri Jul 21 02:18:37 UTC 2017


I noticed a consistent bus error when running fwts tip on an ARM64 platform:

ubuntu at ubuntu:~$ sudo fwts
Running 71 tests, results appended to results.log
Test: Gather kernel system information.
   Gather kernel signature.                                1 skipped, 1 info only
   Gather kernel system information.                       1 info only
   Gather kernel boot command line.                        1 info only
   Gather ACPI driver version.                             1 info only
Test: OPAL Processor Power Management DT Validation Tests
   Test skipped, missing features: devicetree
Test: OPAL Reserved memory DT Validation Test
   Test skipped, missing features: devicetree
Test: OPAL Processor Recovery Diagnostics Info
   Test skipped, missing features: devicetree
Test: Scan kernel log for Oopses.
   Kernel log oops check.                                  2 passed
Test: Run OLOG scan and analysis checks.
  Test skipped.
Test: Scan kernel log for errors and warnings.
   Kernel log error check.                                 1 passed
Test: BMC Info
   BMC Info                                                1 passed
Test: General ACPI information test.
   Determine Kernel ACPI version.                          1 info only
   Determine machine's ACPI version.                                  : 11.7% |
Caught SIGNAL 7 (Bus error), aborting.
Backtrace:
0x0000ffff7eae09e4 /usr/local/lib/fwts/libfwts.so.1(+0x109e4)

I bisected the issue down to this commit:

commit cc3ea59404ef2bb89e40556bce8a8d803b39d3ce
Author: Colin Ian King <colin.king at canonical.com>
Date:   Fri Jul 14 09:35:10 2017 +0100

     lib: fwts_safe_mem: remove need to copy into a buffer

     While fwts_safe_memread() works fine as it is, it is copying data
     to the stack and we don't guard how big that copy can be, so we
     potentially could get a segfault if we run out of stack. Instead
     just read the data. Force gcc not to optimize out the reads by
     using volatile.

     Signed-off-by: Colin Ian King <colin.king at canonical.com>
     Acked-by: Alex Hung <alex.hung at canonical.com>
     Acked-by: Ivan Hu <ivan.hu at canonical.com>
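
For context, the approach the commit message describes (touch the memory through a volatile pointer instead of copying it into a stack buffer) can be sketched roughly as below. This is only an illustration, not the fwts code: the names probe_readable and probe_fault_handler are hypothetical, and guarding the reads with SIGSEGV/SIGBUS handlers plus sigsetjmp is an assumption about how such a probe might be protected.

```c
#define _POSIX_C_SOURCE 200809L

#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Jump target used by the fault handler (hypothetical name). */
static sigjmp_buf probe_env;

/* On a fault, unwind back to the sigsetjmp() call in probe_readable(). */
static void probe_fault_handler(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);
}

/*
 * Hypothetical helper, not the fwts API.
 * Returns 0 if every byte in [src, src + len) could be read,
 * or -1 if a read raised SIGSEGV or SIGBUS.
 */
int probe_readable(const void *src, size_t len)
{
    /* volatile keeps the compiler from optimizing the reads away */
    const volatile uint8_t *p = (const volatile uint8_t *)src;
    struct sigaction sa, old_segv, old_bus;
    int ret = 0;
    size_t i;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = probe_fault_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    /* savemask = 1 so the signal mask is restored after siglongjmp() */
    if (sigsetjmp(probe_env, 1) == 0) {
        for (i = 0; i < len; i++) {
            uint8_t v = p[i];   /* byte-wise read, nothing is copied out */
            (void)v;
        }
    } else {
        ret = -1;               /* a read faulted */
    }

    /* Put the previous handlers back before returning. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return ret;
}
```

A caller would use it as probe_readable(ptr, size) and treat -1 as "not safely readable". The sketch reads one byte at a time, so the width and alignment of each access are fixed. It says nothing about what access pattern the real function generates on ARM64.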

I haven't looked into the issue any further. Is there additional
information I can provide that would help root-cause and fix it?

-- 
Jeffrey Hugo
Qualcomm Datacenter Technologies as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.


