APPLIED: [SRU][N:gcp][PATCH 000/109] Enable NVIDIA Grace platform for Google Cloud Project

Tim Whisonant tim.whisonant at canonical.com
Tue Jun 24 21:35:40 UTC 2025


On Mon, Jun 16, 2025 at 05:43:16PM -0700, Tim Whisonant wrote:
> BugLink: https://bugs.launchpad.net/bugs/2111859
> 
> SRU Justification
> 
> [Impact]
> 
> Google requested a Grace-enabled 6.8 kernel for GCP. This patchset targets
> noble:linux-gcp. The derivative kernel jammy:linux-gcp-6.8 will inherit
> these patches and will be the primary deployment kernel.
> 
> [Fix]
> 
> Add select NVIDIA Grace platform patches, similar to what has been done
> for kernels like noble:linux-azure-nvidia. NVIDIA provides a
> document [1] listing all of the required and recommended kernel patches for
> Grace enablement. Version 11 (April 25, 2025) of the document was used
> when preparing this patchset, as was version
> Ubuntu-azure-nvidia-6.8.0-1016.17 of the noble:linux-azure-nvidia kernel.
> 
> [1] https://docs.nvidia.com/grace-patch-config-guide.pdf
> 
> [Test Plan]
> 
> Boot testing on arm64/64k pages was performed in house. Further testing
> will be done by Google in their Grace-enabled environment once available.
> 
> [Regression potential]
> 
> The regression potential is considered moderate due to the number of
> patches involved and the breadth of the changes. The changes affect
> pte, perf, PCI, cpufreq, ACPI, mmu, coresight, and other kernel
> subsystems.
> 
> [Other]
> 
> SF #00411226
> 
> Alexey Kardashevskiy (1):
>   PCI/DOE: Support discovery version 2
> 
> Arnd Bergmann (1):
>   arm64/io: add constant-argument check
> 
> Aubrey Li (1):
>   ACPI: PRM: Remove unnecessary strict handler address checks
> 
> Barry Song (1):
>   mm: make folio_pte_batch available outside of mm/memory.c
> 
> Beata Michalska (6):
>   arm64: amu: Delay allocating cpumask for AMU FIE support
>   cpufreq: Allow arch_freq_get_on_cpu to return an error
>   cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
>   arm64: Provide an AMU-based version of arch_freq_get_on_cpu
>   arm64: Update AMU-based freq scale factor on entering idle
>   arm64: Utilize for_each_cpu_wrap for reference lookup
> 
> Besar Wicaksono (5):
>   perf arm-spe: Add Neoverse-V2 to common data source encoding list
>   perf: arm_cspmu: nvidia: remove unsupported SCF events
>   perf: arm_cspmu: nvidia: fix sysfs path in the kernel doc
>   perf: arm_cspmu: nvidia: enable NVLINK-C2C port filtering
>   perf: arm_cspmu: nvidia: monitor all ports by default
> 
> Dan Williams (1):
>   ACPI/HMAT: Move HMAT messages to pr_debug()
> 
> David Hildenbrand (14):
>   arm/pgtable: define PFN_PTE_SHIFT
>   nios2/pgtable: define PFN_PTE_SHIFT
>   powerpc/pgtable: define PFN_PTE_SHIFT
>   riscv/pgtable: define PFN_PTE_SHIFT
>   s390/pgtable: define PFN_PTE_SHIFT
>   sparc/pgtable: define PFN_PTE_SHIFT
>   mm/pgtable: make pte_next_pfn() independent of set_ptes()
>   arm/mm: use pte_next_pfn() in set_ptes()
>   powerpc/mm: use pte_next_pfn() in set_ptes()
>   mm/memory: factor out copying the actual PTE in copy_present_pte()
>   mm/memory: pass PTE to copy_present_pte()
>   mm/memory: optimize fork() with PTE-mapped THP
>   mm/memory: ignore dirty/accessed/soft-dirty bits in folio_pte_batch()
>   mm/memory: ignore writable bit in folio_pte_batch()
> 
> Gavin Shan (2):
>   arm64: tlb: Improve __TLBI_VADDR_RANGE()
>   arm64: tlb: Allow range operation for MAX_TLBI_RANGE_PAGES
> 
> Ian Rogers (1):
>   perf arm-spe/cs-etm: Directly iterate CPU maps
> 
> Ilkka Koskinen (1):
>   perf cs-etm: Fix the assert() to handle captured and unprocessed cpu
>     trace
> 
> Ionela Voinescu (1):
>   arch_topology: init capacity_freq_ref to 0
> 
> James Clark (30):
>   coresight: Remove unused ETM Perf stubs
>   coresight: Clarify comments around the PID of the sink owner
>   coresight: Move struct coresight_trace_id_map to common header
>   coresight: Expose map arguments in trace ID API
>   coresight: Make CPU id map a property of a trace ID map
>   coresight: Make language around "activated" sinks consistent
>   coresight: Remove ops callback checks
>   coresight: Move mode to struct coresight_device
>   coresight: Remove the 'enable' field.
>   coresight: Move all sysfs code to sysfs file
>   coresight: Remove atomic type from refcnt
>   coresight: Remove unused stubs
>   coresight: Add explicit member initializers to coresight_dev_type
>   coresight: Add helper for atomically taking the device
>   coresight: Add a helper for getting csdev->mode
>   coresight: Use per-sink trace ID maps for Perf sessions
>   coresight: Remove pending trace ID release mechanism
>   coresight: Emit sink ID in the HW_ID packets
>   coresight: Make trace ID map spinlock local to the map
>   perf auxtrace: Allow number of queues to be specified
>   perf cs-etm: Print error for new PERF_RECORD_AUX_OUTPUT_HW_ID versions
>   perf cs-etm: Use struct perf_cpu as much as possible
>   perf cs-etm: Create decoders after both AUX and HW_ID search passes
>   perf: cs-etm: Allocate queues for all CPUs
>   perf: cs-etm: Move traceid_list to each queue
>   perf: cs-etm: Create decoders based on the trace ID mappings
>   perf: cs-etm: Only save valid trace IDs into files
>   perf: cs-etm: Support version 0.1 of HW_ID packets
>   perf: cs-etm: Print queue number in raw trace dump
>   perf arm-spe: Use old behavior when opening old SPE files
> 
> Jason Gunthorpe (6):
>   arm64/io: Provide a WC friendly __iowriteXX_copy()
>   net: hns3: Remove io_stop_wc() calls after __iowrite64_copy()
>   x86: Stop using weak symbols for __iowrite32_copy()
>   s390: Implement __iowrite32_copy()
>   s390: Stop using weak symbols for __iowrite64_copy()
>   PCI: Fix pci_enable_acs() support for the ACS quirks
> 
> Jie Zhan (1):
>   cppc_cpufreq: Remove HiSilicon CPPC workaround
> 
> Kai-Heng Feng (1):
>   PCI: Use downstream bridges for distributing resources
> 
> Leo Yan (8):
>   perf arm-spe: Rename arm_spe__synth_data_source_generic()
>   perf arm-spe: Rename the common data source encoding
>   perf arm-spe: Support metadata version 2
>   perf arm-spe: Introduce arm_spe__is_homogeneous()
>   perf arm-spe: Use metadata to decide the data source feature
>   perf arm-spe: Remove the unused 'midr' field
>   perf arm-spe: Add Cortex CPUs to common data source encoding list
>   perf arm-spe: Define metadata header version 2
> 
> Namhyung Kim (1):
>   tools/include: Sync arm64 headers with the kernel sources
> 
> Petr Vaněk (1):
>   mm: fix folio_pte_batch() on XEN PV
> 
> Piotr Jaroszynski (1):
>   Fix mmu notifiers for range-based invalidates
> 
> Ryan Roberts (20):
>   arm64/mm: make set_ptes() robust when OAs cross 48-bit boundary
>   mm: clarify the spec for set_ptes()
>   mm: thp: batch-collapse PMD with set_ptes()
>   mm: introduce pte_advance_pfn() and use for pte_next_pfn()
>   arm64/mm: convert pte_next_pfn() to pte_advance_pfn()
>   x86/mm: convert pte_next_pfn() to pte_advance_pfn()
>   mm: tidy up pte_next_pfn() definition
>   arm64/mm: convert READ_ONCE(*ptep) to ptep_get(ptep)
>   arm64/mm: convert set_pte_at() to set_ptes(..., 1)
>   arm64/mm: convert ptep_clear() to ptep_get_and_clear()
>   arm64/mm: new ptep layer to manage contig bit
>   arm64/mm: split __flush_tlb_range() to elide trailing DSB
>   arm64/mm: wire up PTE_CONT for user mappings
>   arm64/mm: implement new wrprotect_ptes() batch API
>   arm64/mm: implement new [get_and_]clear_full_ptes() batch APIs
>   mm: add pte_batch_hint() to reduce scanning in folio_pte_batch()
>   arm64/mm: implement pte_batch_hint()
>   arm64/mm: __always_inline to improve fork() perf
>   arm64/mm: automatically fold contpte mappings
>   arm64/mm: export contpte symbols only to GPL users
> 
> Tim Whisonant (2):
>   UBUNTU: [Packaging] gcp: enable CONFIG_CPUFREQ_ARCH_CUR_FREQ
>   UBUNTU: [Packaging] gcp: enable CONFIG_ARM64_CONTPTE
> 
> Tushar Dave (1):
>   PCI/ACS: Fix 'pci=config_acs=' parameter
> 
> Vidya Sagar (2):
>   PCI: Clear Secondary Status errors after enumeration
>   PCI: Extend ACS configurability
> 
>  .../admin-guide/kernel-parameters.txt         |  32 +
>  Documentation/admin-guide/perf/nvidia-pmu.rst |  52 +-
>  Documentation/admin-guide/pm/cpufreq.rst      |  17 +-
>  arch/arm/include/asm/pgtable.h                |   2 +
>  arch/arm/mm/mmu.c                             |   2 +-
>  arch/arm64/Kconfig                            |   9 +
>  arch/arm64/include/asm/io.h                   | 128 ++++
>  arch/arm64/include/asm/pgtable.h              | 431 ++++++++++--
>  arch/arm64/include/asm/tlbflush.h             |  68 +-
>  arch/arm64/kernel/efi.c                       |   4 +-
>  arch/arm64/kernel/io.c                        |  42 ++
>  arch/arm64/kernel/mte.c                       |   2 +-
>  arch/arm64/kernel/topology.c                  | 150 ++++-
>  arch/arm64/kvm/guest.c                        |   2 +-
>  arch/arm64/mm/Makefile                        |   1 +
>  arch/arm64/mm/contpte.c                       | 404 +++++++++++
>  arch/arm64/mm/fault.c                         |  12 +-
>  arch/arm64/mm/fixmap.c                        |   4 +-
>  arch/arm64/mm/hugetlbpage.c                   |  40 +-
>  arch/arm64/mm/kasan_init.c                    |   6 +-
>  arch/arm64/mm/mmu.c                           |  16 +-
>  arch/arm64/mm/pageattr.c                      |   6 +-
>  arch/arm64/mm/trans_pgd.c                     |   6 +-
>  arch/nios2/include/asm/pgtable.h              |   2 +
>  arch/powerpc/include/asm/pgtable.h            |   2 +
>  arch/powerpc/mm/pgtable.c                     |   5 +-
>  arch/riscv/include/asm/pgtable.h              |   2 +
>  arch/s390/include/asm/io.h                    |  15 +
>  arch/s390/include/asm/pgtable.h               |   2 +
>  arch/s390/pci/pci.c                           |   6 -
>  arch/sparc/include/asm/pgtable_64.h           |   2 +
>  arch/x86/include/asm/io.h                     |  17 +
>  arch/x86/include/asm/pgtable.h                |   8 +-
>  arch/x86/kernel/cpu/aperfmperf.c              |   2 +-
>  arch/x86/kernel/cpu/proc.c                    |   7 +-
>  arch/x86/lib/Makefile                         |   1 -
>  arch/x86/lib/iomap_copy_64.S                  |  15 -
>  debian.gcp/config/annotations                 |   6 +
>  drivers/acpi/numa/hmat.c                      |  24 +-
>  drivers/acpi/prmt.c                           |   4 +-
>  drivers/base/arch_topology.c                  |   8 +-
>  drivers/cpufreq/Kconfig.x86                   |  12 +
>  drivers/cpufreq/cppc_cpufreq.c                |  73 +-
>  drivers/cpufreq/cpufreq.c                     |  38 +-
>  drivers/hwtracing/coresight/coresight-core.c  | 515 +-------------
>  drivers/hwtracing/coresight/coresight-dummy.c |   3 +-
>  drivers/hwtracing/coresight/coresight-etb10.c |  29 +-
>  .../hwtracing/coresight/coresight-etm-perf.c  |  43 +-
>  .../hwtracing/coresight/coresight-etm-perf.h  |  18 -
>  drivers/hwtracing/coresight/coresight-etm.h   |   2 -
>  .../coresight/coresight-etm3x-core.c          |  32 +-
>  .../coresight/coresight-etm3x-sysfs.c         |   4 +-
>  .../coresight/coresight-etm4x-core.c          |  35 +-
>  drivers/hwtracing/coresight/coresight-etm4x.h |   1 -
>  drivers/hwtracing/coresight/coresight-priv.h  |   8 +-
>  drivers/hwtracing/coresight/coresight-stm.c   |  33 +-
>  drivers/hwtracing/coresight/coresight-sysfs.c | 392 +++++++++++
>  .../hwtracing/coresight/coresight-tmc-core.c  |   2 +-
>  .../hwtracing/coresight/coresight-tmc-etf.c   |  46 +-
>  .../hwtracing/coresight/coresight-tmc-etr.c   |  38 +-
>  drivers/hwtracing/coresight/coresight-tmc.h   |   7 +-
>  drivers/hwtracing/coresight/coresight-tpda.c  |  13 +-
>  drivers/hwtracing/coresight/coresight-tpdm.c  |   3 +-
>  drivers/hwtracing/coresight/coresight-tpiu.c  |  14 +-
>  .../hwtracing/coresight/coresight-trace-id.c  | 138 ++--
>  .../hwtracing/coresight/coresight-trace-id.h  |  70 +-
>  drivers/hwtracing/coresight/ultrasoc-smb.c    |  22 +-
>  drivers/hwtracing/coresight/ultrasoc-smb.h    |   2 -
>  .../net/ethernet/hisilicon/hns3/hns3_enet.c   |   4 -
>  drivers/pci/doe.c                             |  12 +-
>  drivers/pci/pci.c                             | 160 +++--
>  drivers/pci/probe.c                           |   3 +
>  drivers/pci/setup-bus.c                       |   3 +-
>  drivers/perf/arm_cspmu/nvidia_cspmu.c         |  75 +--
>  include/linux/coresight-pmu.h                 |  17 +-
>  include/linux/coresight.h                     | 151 ++---
>  include/linux/cpufreq.h                       |   2 +-
>  include/linux/efi.h                           |   5 +
>  include/linux/io.h                            |   8 +-
>  include/linux/pgtable.h                       |  65 +-
>  include/uapi/linux/pci_regs.h                 |   1 +
>  lib/iomap_copy.c                              |  13 +-
>  mm/huge_memory.c                              |  58 +-
>  mm/internal.h                                 |  88 +++
>  mm/memory.c                                   | 143 ++--
>  tools/arch/arm64/include/asm/cputype.h        |  10 +
>  tools/include/linux/coresight-pmu.h           |  17 +-
>  tools/perf/arch/arm/util/cs-etm.c             | 307 ++++-----
>  tools/perf/arch/arm64/util/arm-spe.c          |   8 +-
>  .../util/arm-spe-decoder/arm-spe-decoder.h    |  18 +-
>  tools/perf/util/arm-spe.c                     | 234 ++++++-
>  tools/perf/util/arm-spe.h                     |  38 +-
>  tools/perf/util/auxtrace.c                    |   9 +-
>  tools/perf/util/auxtrace.h                    |   1 +
>  .../perf/util/cs-etm-decoder/cs-etm-decoder.c |  36 +-
>  .../perf/util/cs-etm-decoder/cs-etm-decoder.h |   2 +-
>  tools/perf/util/cs-etm.c                      | 631 +++++++++++-------
>  tools/perf/util/cs-etm.h                      |  12 +-
>  98 files changed, 3414 insertions(+), 1874 deletions(-)
>  create mode 100644 arch/arm64/mm/contpte.c
>  delete mode 100644 arch/x86/lib/iomap_copy_64.S
> 
> -- 
> 2.43.0
> 

Applied to noble:linux-gcp master-next branch.
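
For anyone checking a build from this branch, here is a minimal sketch (not
part of the patchset) that looks for the two options enabled by the
[Packaging] patches in a kernel config. The helper names are hypothetical,
and the /boot/config-<release> path is an assumption based on the usual
Ubuntu layout.

```python
import platform
import re

# Options enabled by the two "UBUNTU: [Packaging] gcp" patches in this series.
REQUIRED = ("CONFIG_ARM64_CONTPTE", "CONFIG_CPUFREQ_ARCH_CUR_FREQ")


def enabled_options(config_text):
    """Return the set of options set to 'y' or 'm' in kernel config text."""
    enabled = set()
    for line in config_text.splitlines():
        match = re.match(r"^(CONFIG_[A-Za-z0-9_]+)=([ym])$", line.strip())
        if match:
            enabled.add(match.group(1))
    return enabled


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not enabled, in the given order."""
    enabled = enabled_options(config_text)
    return [opt for opt in required if opt not in enabled]


def check_running_kernel():
    """Check the running kernel's config (assumed path, see the note above)."""
    path = f"/boot/config-{platform.release()}"
    with open(path) as config_file:
        return missing_options(config_file.read())
```

Calling check_running_kernel() on a booted system returns the list of
options that are still unset; an empty list means both are enabled.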


