[Bug 2078917] [NEW] Ubuntu 24.04 - It cannot be installed with DL380a Gen12 (2P, SRF-SP) + NVidia L40 GPU in slot17

David Chang 2078917 at bugs.launchpad.net
Wed Sep 4 08:07:46 UTC 2024


Public bug reported:

Description:
Ubuntu 24.04 fails to install on a DL380a Gen12 (2P, Intel Sierra Forest) with an NVidia L40 GPU in slot17.

A random write to the VF BAR0 memory region causes the kernel to
report a machine check exception (MCE).


Version-Release number:
Ubuntu 24.04

How reproducible:
Each time

Steps to reproduce:
- Enable PCI segment, Intel VT-d, and SR-IOV in the BIOS
- Run a fresh install on a 2P DL380a server with the GPU (NVidia L40) in slot17

Expected results:
No MCE; the installation completes without problems.

Actual results:
The kernel reports MCE errors.


Additional info:

We tracked this issue on RHEL 9.4; it is caused by the following
patches:

cb4a6ccf3583 perf/x86/intel/uncore: Support Sierra Forest and Grand Ridge (v6.8-rc1)
388d76175bd9 perf/x86/intel/uncore: Support IIO free-running counters on GNR (v6.8-rc1)
632c4bf6d007 perf/x86/intel/uncore: Support Granite Rapids (v6.8-rc1)
b560e0cd882b perf/x86/uncore: Use u64 to replace unsigned for the uncore offsets array (v6.8-rc1)
cf35791476fc perf/x86/intel/uncore: Generic uncore_get_uncores and MMIO format of SPR (v6.8-rc1)

** Affects: mdadm (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to mdadm in Ubuntu.
https://bugs.launchpad.net/bugs/2078917




