[Jammy][PULL] Intel: enable x86 AMX

Andrea Righi andrea.righi at canonical.com
Mon Apr 4 14:31:19 UTC 2022


BugLink: https://bugs.launchpad.net/bugs/1967750

[Impact]

Enable the new Intel AMX (Advanced Matrix Extensions, aka TMUL)
instructions on the 5.15 kernel.

[Test case]

Tests have been performed directly by Intel.

[Fix]

Apply the following upstream commits. Most of them are clean
cherry-picks; 4 of them required small context adjustments:

20df73756148 ("selftests/x86/amx: Update the ARCH_REQ_XCOMP_PERM test")
063452fd94d1 ("x86/fpu/xstate: Fix the ARCH_REQ_XCOMP_PERM implementation")
fa31a4d669bd ("x86/cpufeatures: Put the AMX macros in the word 18 block")
6c3118c32129 ("signal: Skip the altstack update when not needed")
52d0b8b18776 ("x86/fpu/signal: Initialize sw_bytes in save_xstate_epilog()")
d7a9590f608d ("Documentation/x86: Add documentation for using dynamic XSTATE features")
101c669d165d ("selftests/x86/amx: Add context switch test")
6a3e0651b4a0 ("selftests/x86/amx: Add test cases for AMX state management")
2308ee57d93d ("x86/fpu/amx: Enable the AMX feature in 64-bit mode")
db3e7321b4b8 ("x86/fpu: Add XFD handling for dynamic states")
2ae996e0c1a3 ("x86/fpu: Calculate the default sizes independently")
eec2113eabd9 ("x86/fpu/amx: Define AMX state components and have it used for boot-time checks")
70c3f1671b0c ("x86/fpu/xstate: Prepare XSAVE feature table for gaps in state component numbers")
500afbf645a0 ("x86/fpu/xstate: Add fpstate_realloc()/free()")
783e87b40495 ("x86/fpu/xstate: Add XFD #NM handler")
672365477ae8 ("x86/fpu: Update XFD state where required")
5529acf47ec3 ("x86/fpu: Add sanity checks for XFD")
8bf26758ca96 ("x86/fpu: Add XFD state to fpstate")
dae1bd583896 ("x86/msr-index: Add MSRs for XFD")
c351101678ce ("x86/cpufeatures: Add eXtended Feature Disabling (XFD) feature bit")
e61d6310a0f8 ("x86/fpu: Reset permission and fpstate on exec()")
9e798e9aa14c ("x86/fpu: Prepare fpu_clone() for dynamically enabled features")
53599b4d54b9 ("x86/fpu/signal: Prepare for variable sigframe length")
4b7ca609a33d ("x86/signal: Use fpu::__state_user_size for sigalt stack validation")
23686ef25d4a ("x86/fpu: Add basic helpers for dynamically enabled features")
db8268df0983 ("x86/arch_prctl: Add controls for dynamic XSTATE components")
c33f0a81a2cf ("x86/fpu: Add fpu_state_config::legacy_features")
6f6a7c09c406 ("x86/fpu: Add members to struct fpu to cache permission information")
84e4dccc8fce ("x86/fpu/xstate: Provide xstate_calculate_size()")
3aac3ebea08f ("x86/signal: Implement sigaltstack size validation")
1bdda24c4af6 ("signal: Add an optional check for altstack size")
582b01b6ab27 ("x86/fpu: Remove old KVM FPU interface")
d69c1382e1b7 ("x86/kvm: Convert FPU handling to a single swap buffer")
69f6ed1d14c6 ("x86/fpu: Provide infrastructure for KVM FPU cleanup")
75c52dad5e32 ("x86/fpu: Prepare for sanitizing KVM FPU code")
d72c87018d00 ("x86/fpu/xstate: Move remaining xfeature helpers to core")
eda32f4f93b4 ("x86/fpu: Rework restore_regs_from_fpstate()")
daddee247319 ("x86/fpu: Mop up xfeatures_mask_uabi()")
1c253ff2287f ("x86/fpu: Move xstate feature masks to fpu_*_cfg")
2bd264bce238 ("x86/fpu: Move xstate size to fpu_*_cfg")
cd9ae7617449 ("x86/fpu/xstate: Cleanup size calculations")
617473acdfe4 ("x86/fpu: Cleanup fpu__init_system_xstate_size_legacy()")
578971f4e228 ("x86/fpu: Provide struct fpu_config")
5509cc78080d ("x86/fpu/signal: Use fpstate for size and features")
49e4eb4125d5 ("x86/fpu/xstate: Use fpstate for copy_uabi_to_xstate()")
3ac8d75778fc ("x86/fpu: Use fpstate in __copy_xstate_to_uabi_buf()")
ad6ede407aae ("x86/fpu: Use fpstate in fpu_copy_kvm_uabi_to_fpstate()")
0b2d39aa0357 ("x86/fpu/xstate: Use fpstate for xsave_to_user_sigframe()")
073e627a4537 ("x86/fpu/xstate: Use fpstate for os_xsave()")
be31dfdfd75b ("x86/fpu: Use fpstate::size")
248452ce21ae ("x86/fpu: Add size and mask information to fpstate")
2dd8eedc80b1 ("x86/process: Move arch_thread_struct_whitelist() out of line")
f0cbc8b3cdf7 ("x86/fpu: Do not leak fpstate pointer on fork")
2f27b5034244 ("x86/fpu: Remove fpu::state")
63d6bdf36ce1 ("x86/math-emu: Convert to fpstate")
c20942ce5128 ("x86/fpu/core: Convert to fpstate")
7e049e8b7459 ("x86/fpu/signal: Convert to fpstate")
caee31a36c33 ("x86/fpu/regset: Convert to fpstate")
cceb496420fa ("x86/fpu: Convert tracing to fpstate")
1c57572d754f ("x86/KVM: Convert to fpstate")
087df48c298c ("x86/fpu: Replace KVMs xstate component clearing")
18b3fa1ad15f ("x86/fpu: Convert restore_fpregs_from_fpstate() to struct fpstate")
f83ac56acdad ("x86/fpu: Convert fpstate_init() to struct fpstate")
87d0e5be0fac ("x86/fpu: Provide struct fpstate")
bf5d00470787 ("x86/fpu: Replace KVMs home brewed FPU copy to user")
079ec41b22b9 ("x86/fpu: Provide a proper function for ex_handler_fprestore()")
b56d2795b297 ("x86/fpu: Replace the includes of fpu/internal.h")
6415bb809263 ("x86/fpu: Mop up the internal.h leftovers")
ff0c37e191f2 ("x86/sev: Include fpu/xcr.h")
0ae67cc34f76 ("x86/fpu: Remove internal.h dependency from fpu/signal.h")
90489f1dee8b ("x86/fpu: Move fpstate functions to api.h")
d9d005f32aac ("x86/fpu: Move mxcsr related code to core")
9848fb96839b ("x86/fpu: Move fpregs_restore_userregs() to core")
cdcb6fa14e14 ("x86/fpu: Make WARN_ON_FPU() private")
34002571cb41 ("x86/fpu: Move legacy ASM wrappers to core")
df95b0f1aa56 ("x86/fpu: Move os_xsave() and os_xrstor() to core")
b579d0c3750e ("x86/fpu: Make os_xrstor_booting() private")
d06241f52cfe ("x86/fpu: Clean up CPU feature tests")
63e81807c1f9 ("x86/fpu: Move context switch and exit to user inlines into sched.h")
9603445549da ("x86/fpu: Mark fpu__init_prepare_fx_sw_frame() as __init")
ca834defd33b ("x86/fpu: Rework copy_xstate_to_uabi_buf()")
ea4d6938d4c0 ("x86/fpu: Replace KVMs home brewed FPU copy from user")
a0ff0611c2fb ("x86/fpu: Move KVMs FPU swapping to FPU core")
63cf05a19a5d ("x86/fpu/xstate: Mark all init only functions __init")
ffd3e504c9e0 ("x86/fpu/xstate: Provide and use for_each_xfeature()")
126fe0401883 ("x86/fpu: Cleanup xstate xcomp_bv initialization")
509e7a30cd0a ("x86/fpu: Do not inherit FPU context for kernel and IO worker threads")
2d16a1876f20 ("x86/process: Clone FPU in copy_thread()")
01f9f62d3ae7 ("x86/fpu: Remove pointless memset in fpu_clone()")
dc2f39fd1bf2 ("x86/fpu: Cleanup the on_boot_cpu clutter")
f5daf836f292 ("x86/fpu: Restrict xsaves()/xrstors() to independent states")
b50854eca0e0 ("x86/pkru: Remove useless include")
d2d926482cdf ("x86/fpu: Update stale comments")
9568bfb4f04b ("x86/fpu: Remove pointless argument from switch_fpu_finish()")
724fc0248d45 ("x86/fpu/signal: Fix missed conversion to correct boolean retval in save_xstate_epilog()")
a2a8fd9a3efd ("x86/fpu/signal: Change return code of restore_fpregs_from_user() to boolean")
be0040144152 ("x86/fpu/signal: Change return code of check_xstate_in_sigframe() to boolean")
1193f408cd51 ("x86/fpu/signal: Change return type of __fpu_restore_sig() to boolean")
f3305be5feec ("x86/fpu/signal: Change return type of fpu__restore_sig() to boolean")
ee4ecdfbd289 ("x86/signal: Change return type of restore_sigcontext() to boolean")
2af07f3a6e9f ("x86/fpu/signal: Change return type of copy_fpregs_to_sigframe() helpers to boolean")
052adee66828 ("x86/fpu/signal: Change return type of copy_fpstate_to_sigframe() to boolean")
fcfb7163329c ("x86/fpu/signal: Move xstate clearing out of copy_fpregs_to_sigframe()")
4164a482a5d9 ("x86/fpu/signal: Move header zeroing out of xsave_to_user_sigframe()")
4339d0c63c2d ("x86/fpu/signal: Clarify exception handling in restore_fpregs_from_user()")
0c2e62ba04cd ("x86/extable: Remove EX_TYPE_FAULT from MCE safe fixups")
c6304556f3ae ("x86/fpu: Use EX_TYPE_FAULT_MCE_SAFE for exception fixups")
c1c97d175493 ("x86/copy_mc: Use EX_TYPE_DEFAULT_MCE_SAFE for exception fixups")
2cadf5248b93 ("x86/extable: Provide EX_TYPE_DEFAULT_MCE_SAFE and EX_TYPE_FAULT_MCE_SAFE")
46d28947d987 ("x86/extable: Rework the exception table mechanics")
083b32d6f4fa ("x86/mce: Get rid of stray semicolons")
e42404afc4ca ("x86/mce: Deduplicate exception handling")
32fd8b59f91f ("x86/extable: Get rid of redundant macros")
326b567f82df ("x86/extable: Tidy up redundant handler functions")

[Regression potential]

The changes are limited to x86, mostly FPU code and signal handling, so
we may see regressions on x86, especially on FPU-intensive workloads.

--
The following changes since commit f4a9abe17854fc753c84a0ba4ac275e715a008f3:

  UBUNTU: Ubuntu-5.15.0-25.25 (2022-03-30 17:28:11 +0200)

are available in the Git repository at:

  git://git.launchpad.net/~arighi/+git/intel-amx tags/intel-amx

for you to fetch changes up to 04530176ef2a39e7aa18279b371ee5c2f2ee4c4f:

  selftests/x86/amx: Update the ARCH_REQ_XCOMP_PERM test (2022-04-04 15:11:31 +0200)

----------------------------------------------------------------
Anders Roxell (1):
      x86/fpu/signal: Fix missed conversion to correct boolean retval in save_xstate_epilog()

Chang S. Bae (20):
      x86/fpu/xstate: Provide xstate_calculate_size()
      x86/arch_prctl: Add controls for dynamic XSTATE components
      x86/fpu/signal: Prepare for variable sigframe length
      x86/fpu: Reset permission and fpstate on exec()
      x86/cpufeatures: Add eXtended Feature Disabling (XFD) feature bit
      x86/msr-index: Add MSRs for XFD
      x86/fpu: Add XFD state to fpstate
      x86/fpu: Update XFD state where required
      x86/fpu/xstate: Add XFD #NM handler
      x86/fpu/xstate: Add fpstate_realloc()/free()
      x86/fpu/xstate: Prepare XSAVE feature table for gaps in state component numbers
      x86/fpu/amx: Define AMX state components and have it used for boot-time checks
      x86/fpu: Calculate the default sizes independently
      x86/fpu: Add XFD handling for dynamic states
      x86/fpu/amx: Enable the AMX feature in 64-bit mode
      selftests/x86/amx: Add test cases for AMX state management
      selftests/x86/amx: Add context switch test
      Documentation/x86: Add documentation for using dynamic XSTATE features
      signal: Skip the altstack update when not needed
      selftests/x86/amx: Update the ARCH_REQ_XCOMP_PERM test

Jim Mattson (1):
      x86/cpufeatures: Put the AMX macros in the word 18 block

Marco Elver (1):
      x86/fpu/signal: Initialize sw_bytes in save_xstate_epilog()

Thomas Gleixner (90):
      x86/extable: Tidy up redundant handler functions
      x86/extable: Get rid of redundant macros
      x86/mce: Deduplicate exception handling
      x86/mce: Get rid of stray semicolons
      x86/extable: Rework the exception table mechanics
      x86/extable: Provide EX_TYPE_DEFAULT_MCE_SAFE and EX_TYPE_FAULT_MCE_SAFE
      x86/copy_mc: Use EX_TYPE_DEFAULT_MCE_SAFE for exception fixups
      x86/fpu: Use EX_TYPE_FAULT_MCE_SAFE for exception fixups
      x86/extable: Remove EX_TYPE_FAULT from MCE safe fixups
      x86/fpu/signal: Clarify exception handling in restore_fpregs_from_user()
      x86/fpu/signal: Move header zeroing out of xsave_to_user_sigframe()
      x86/fpu/signal: Move xstate clearing out of copy_fpregs_to_sigframe()
      x86/fpu/signal: Change return type of copy_fpstate_to_sigframe() to boolean
      x86/fpu/signal: Change return type of copy_fpregs_to_sigframe() helpers to boolean
      x86/signal: Change return type of restore_sigcontext() to boolean
      x86/fpu/signal: Change return type of fpu__restore_sig() to boolean
      x86/fpu/signal: Change return type of __fpu_restore_sig() to boolean
      x86/fpu/signal: Change return code of check_xstate_in_sigframe() to boolean
      x86/fpu/signal: Change return code of restore_fpregs_from_user() to boolean
      x86/fpu: Remove pointless argument from switch_fpu_finish()
      x86/fpu: Update stale comments
      x86/pkru: Remove useless include
      x86/fpu: Restrict xsaves()/xrstors() to independent states
      x86/fpu: Cleanup the on_boot_cpu clutter
      x86/fpu: Remove pointless memset in fpu_clone()
      x86/process: Clone FPU in copy_thread()
      x86/fpu: Do not inherit FPU context for kernel and IO worker threads
      x86/fpu: Cleanup xstate xcomp_bv initialization
      x86/fpu/xstate: Provide and use for_each_xfeature()
      x86/fpu/xstate: Mark all init only functions __init
      x86/fpu: Move KVMs FPU swapping to FPU core
      x86/fpu: Replace KVMs home brewed FPU copy from user
      x86/fpu: Rework copy_xstate_to_uabi_buf()
      x86/fpu: Mark fpu__init_prepare_fx_sw_frame() as __init
      x86/fpu: Move context switch and exit to user inlines into sched.h
      x86/fpu: Clean up CPU feature tests
      x86/fpu: Make os_xrstor_booting() private
      x86/fpu: Move os_xsave() and os_xrstor() to core
      x86/fpu: Move legacy ASM wrappers to core
      x86/fpu: Make WARN_ON_FPU() private
      x86/fpu: Move fpregs_restore_userregs() to core
      x86/fpu: Move mxcsr related code to core
      x86/fpu: Move fpstate functions to api.h
      x86/fpu: Remove internal.h dependency from fpu/signal.h
      x86/sev: Include fpu/xcr.h
      x86/fpu: Mop up the internal.h leftovers
      x86/fpu: Replace the includes of fpu/internal.h
      x86/fpu: Provide a proper function for ex_handler_fprestore()
      x86/fpu: Replace KVMs home brewed FPU copy to user
      x86/fpu: Provide struct fpstate
      x86/fpu: Convert fpstate_init() to struct fpstate
      x86/fpu: Convert restore_fpregs_from_fpstate() to struct fpstate
      x86/fpu: Replace KVMs xstate component clearing
      x86/KVM: Convert to fpstate
      x86/fpu: Convert tracing to fpstate
      x86/fpu/regset: Convert to fpstate
      x86/fpu/signal: Convert to fpstate
      x86/fpu/core: Convert to fpstate
      x86/math-emu: Convert to fpstate
      x86/fpu: Remove fpu::state
      x86/fpu: Do not leak fpstate pointer on fork
      x86/process: Move arch_thread_struct_whitelist() out of line
      x86/fpu: Add size and mask information to fpstate
      x86/fpu: Use fpstate::size
      x86/fpu/xstate: Use fpstate for os_xsave()
      x86/fpu/xstate: Use fpstate for xsave_to_user_sigframe()
      x86/fpu: Use fpstate in fpu_copy_kvm_uabi_to_fpstate()
      x86/fpu: Use fpstate in __copy_xstate_to_uabi_buf()
      x86/fpu/xstate: Use fpstate for copy_uabi_to_xstate()
      x86/fpu/signal: Use fpstate for size and features
      x86/fpu: Provide struct fpu_config
      x86/fpu: Cleanup fpu__init_system_xstate_size_legacy()
      x86/fpu/xstate: Cleanup size calculations
      x86/fpu: Move xstate size to fpu_*_cfg
      x86/fpu: Move xstate feature masks to fpu_*_cfg
      x86/fpu: Mop up xfeatures_mask_uabi()
      x86/fpu: Rework restore_regs_from_fpstate()
      x86/fpu/xstate: Move remaining xfeature helpers to core
      x86/fpu: Prepare for sanitizing KVM FPU code
      x86/fpu: Provide infrastructure for KVM FPU cleanup
      x86/kvm: Convert FPU handling to a single swap buffer
      x86/fpu: Remove old KVM FPU interface
      signal: Add an optional check for altstack size
      x86/signal: Implement sigaltstack size validation
      x86/fpu: Add members to struct fpu to cache permission information
      x86/fpu: Add fpu_state_config::legacy_features
      x86/fpu: Add basic helpers for dynamically enabled features
      x86/signal: Use fpu::__state_user_size for sigalt stack validation
      x86/fpu: Prepare fpu_clone() for dynamically enabled features
      x86/fpu: Add sanity checks for XFD

Yang Zhong (1):
      x86/fpu/xstate: Fix the ARCH_REQ_XCOMP_PERM implementation

 Documentation/admin-guide/kernel-parameters.txt |   9 +
 Documentation/x86/index.rst                     |   1 +
 Documentation/x86/xstate.rst                    |  65 ++
 arch/Kconfig                                    |   3 +
 arch/x86/Kconfig                                |  17 +
 arch/x86/events/perf_event.h                    |   1 +
 arch/x86/ia32/ia32_signal.c                     |  15 +-
 arch/x86/include/asm/asm.h                      |  50 +-
 arch/x86/include/asm/cpufeatures.h              |   4 +
 arch/x86/include/asm/extable.h                  |  44 +-
 arch/x86/include/asm/extable_fixup_types.h      |  22 +
 arch/x86/include/asm/fpu/api.h                  |  58 +-
 arch/x86/include/asm/fpu/internal.h             | 540 --------------
 arch/x86/include/asm/fpu/sched.h                |  68 ++
 arch/x86/include/asm/fpu/signal.h               |  13 +-
 arch/x86/include/asm/fpu/types.h                | 214 +++++-
 arch/x86/include/asm/fpu/xcr.h                  |  11 -
 arch/x86/include/asm/fpu/xstate.h               |  90 +--
 arch/x86/include/asm/kvm_host.h                 |   7 +-
 arch/x86/include/asm/msr-index.h                |   2 +
 arch/x86/include/asm/msr.h                      |   4 +-
 arch/x86/include/asm/pkru.h                     |   2 +-
 arch/x86/include/asm/processor.h                |   9 +-
 arch/x86/include/asm/proto.h                    |   2 +-
 arch/x86/include/asm/segment.h                  |   2 +-
 arch/x86/include/asm/trace/fpu.h                |   4 +-
 arch/x86/include/uapi/asm/prctl.h               |   4 +
 arch/x86/kernel/cpu/bugs.c                      |   2 +-
 arch/x86/kernel/cpu/common.c                    |   2 +-
 arch/x86/kernel/cpu/cpuid-deps.c                |   2 +
 arch/x86/kernel/cpu/mce/core.c                  |  40 +-
 arch/x86/kernel/cpu/mce/internal.h              |  14 +-
 arch/x86/kernel/cpu/mce/severity.c              |  22 +-
 arch/x86/kernel/fpu/bugs.c                      |   2 +-
 arch/x86/kernel/fpu/context.h                   |  83 +++
 arch/x86/kernel/fpu/core.c                      | 391 +++++++++--
 arch/x86/kernel/fpu/init.c                      |  76 +-
 arch/x86/kernel/fpu/internal.h                  |  28 +
 arch/x86/kernel/fpu/legacy.h                    | 115 +++
 arch/x86/kernel/fpu/regset.c                    |  36 +-
 arch/x86/kernel/fpu/signal.c                    | 285 ++++----
 arch/x86/kernel/fpu/xstate.c                    | 898 +++++++++++++++++++-----
 arch/x86/kernel/fpu/xstate.h                    | 278 ++++++++
 arch/x86/kernel/process.c                       |  27 +-
 arch/x86/kernel/process_32.c                    |   5 +-
 arch/x86/kernel/process_64.c                    |   5 +-
 arch/x86/kernel/ptrace.c                        |   2 +-
 arch/x86/kernel/sev.c                           |   2 +-
 arch/x86/kernel/signal.c                        |  83 ++-
 arch/x86/kernel/smpboot.c                       |   2 +-
 arch/x86/kernel/traps.c                         |  40 +-
 arch/x86/kvm/svm/sev.c                          |   2 +-
 arch/x86/kvm/svm/svm.c                          |   7 +-
 arch/x86/kvm/vmx/vmx.c                          |   2 +-
 arch/x86/kvm/x86.c                              | 258 +------
 arch/x86/lib/copy_mc_64.S                       |   8 +-
 arch/x86/math-emu/fpu_aux.c                     |   2 +-
 arch/x86/math-emu/fpu_entry.c                   |   6 +-
 arch/x86/math-emu/fpu_system.h                  |   2 +-
 arch/x86/mm/extable.c                           | 135 ++--
 arch/x86/net/bpf_jit_comp.c                     |  11 +-
 arch/x86/power/cpu.c                            |   2 +-
 include/linux/signal.h                          |   6 +
 kernel/signal.c                                 |  44 +-
 scripts/sorttable.c                             |   4 +-
 tools/testing/selftests/x86/Makefile            |   2 +-
 tools/testing/selftests/x86/amx.c               | 863 +++++++++++++++++++++++
 67 files changed, 3480 insertions(+), 1575 deletions(-)
 create mode 100644 Documentation/x86/xstate.rst
 create mode 100644 arch/x86/include/asm/extable_fixup_types.h
 create mode 100644 arch/x86/include/asm/fpu/sched.h
 create mode 100644 arch/x86/kernel/fpu/context.h
 create mode 100644 arch/x86/kernel/fpu/internal.h
 create mode 100644 arch/x86/kernel/fpu/legacy.h
 create mode 100644 arch/x86/kernel/fpu/xstate.h
 create mode 100644 tools/testing/selftests/x86/amx.c


