APPLIED: [SRU][Impish][PATCH 0/1] amd_sfh: Null pointer dereference on early device init causes early panic and fails to boot

Kleber Souza kleber.souza at canonical.com
Thu Jan 13 11:22:15 UTC 2022


On 13.01.22 00:41, Matthew Ruffell wrote:
> BugLink: https://bugs.launchpad.net/bugs/1956519
>
> [Impact]
>
> A regression was introduced into 5.13.0-23-generic for devices using AMD Ryzen
> chipsets that incorporate AMD Sensor Fusion Hub (SFH) HID devices, which are
> mostly Ryzen based laptops, but desktops do have the SOC embedded as well.
>
> On early boot, when the driver initialises the device, it hits a null pointer
> dereference with the following stack trace:
>
> BUG: kernel NULL pointer dereference, address: 000000000000000c
> #PF: supervisor write access in kernel mode
> #PF: error_code(0x0002) - not-present page
> PGD 0 P4D 0
> Oops: 0002 [#1] SMP NOPTI
> CPU: 0 PID: 175 Comm: systemd-udevd Not tainted 5.13.0-23-generic #23-Ubuntu
> RIP: 0010:amd_sfh_hid_client_init+0x47/0x350 [amd_sfh]
> Call Trace:
>    ? __pci_set_master+0x5f/0xe0
>    amd_mp2_pci_probe+0xad/0x160 [amd_sfh]
>    local_pci_probe+0x48/0x80
>    pci_device_probe+0x105/0x1c0
>    really_probe+0x24b/0x4c0
>    driver_probe_device+0xf0/0x160
>    device_driver_attach+0xab/0xb0
>    __driver_attach+0xb2/0x140
>    ? device_driver_attach+0xb0/0xb0
>    bus_for_each_dev+0x7e/0xc0
>    driver_attach+0x1e/0x20
>    bus_add_driver+0x135/0x1f0
>    driver_register+0x95/0xf0
>    ? 0xffffffffc03d2000
>    __pci_register_driver+0x57/0x60
>    amd_mp2_pci_driver_init+0x23/0x1000 [amd_sfh]
>    do_one_initcall+0x48/0x1d0
>    ? kmem_cache_alloc_trace+0xfb/0x240
>    do_init_module+0x62/0x290
>    load_module+0xa8f/0xb10
>    __do_sys_finit_module+0xc2/0x120
>    __x64_sys_finit_module+0x18/0x20
>    do_syscall_64+0x61/0xb0
>    ? ksys_mmap_pgoff+0x135/0x260
>    ? exit_to_user_mode_prepare+0x37/0xb0
>    ? syscall_exit_to_user_mode+0x27/0x50
>    ? __x64_sys_mmap+0x33/0x40
>    ? do_syscall_64+0x6e/0xb0
>    ? do_syscall_64+0x6e/0xb0
>    ? do_syscall_64+0x6e/0xb0
>    ? syscall_exit_to_user_mode+0x27/0x50
>    ? do_syscall_64+0x6e/0xb0
>    ? exc_page_fault+0x8f/0x170
>    ? asm_exc_page_fault+0x8/0x30
>    entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> This causes a panic and the system is unable to continue booting, and the user
> must select an older kernel to boot.
>
> [Fix]
>
> The issue was introduced in 5.13.0-23-generic by the commit:
>
> commit d46ef750ed58cbeeba2d9a55c99231c30a172764
> commit-impish 56559d7910e704470ad72da58469b5588e8cbf85
> Author: Evgeny Novikov <novikov at ispras.ru>
> Date: Tue Jun 1 19:38:01 2021 +0300
> Subject:HID: amd_sfh: Fix potential NULL pointer dereference
> Link: https://github.com/torvalds/linux/commit/d46ef750ed58cbeeba2d9a55c99231c30a172764
>
> The issue is pretty straightforward, amd_sfh_client.c attempts to dereference
> cl_data, but it is NULL:
>
> $ eu-addr2line -ifae ./usr/lib/debug/lib/modules/5.13.0-23-generic/kernel/drivers/hid/amd-sfh-hid/amd_sfh.ko amd_sfh_hid_client_init+0x47
> 0x0000000000000767
> amd_sfh_hid_client_init
> /build/linux-k2e9CH/linux-5.13.0/drivers/hid/amd-sfh-hid/amd_sfh_client.c:147:27
>
> 134 int amd_sfh_hid_client_init(struct amd_mp2_dev *privdata)
> 135 {
> ...
> 146
> 147 cl_data->num_hid_devices = amd_mp2_get_sensor_num(privdata, &cl_data->sensor_idx[0]);
> 148
> ...
>
> The patch moves the call to amd_sfh_hid_client_init() before privdata->cl_data
> is actually allocated by devm_kzalloc, hence cl_data being NULL.
>
> + rc = amd_sfh_hid_client_init(privdata);
> + if (rc)
> + return rc;
> +
>          privdata->cl_data = devm_kzalloc(&pdev->dev, sizeof(struct amdtp_cl_data), GFP_KERNEL);
>          if (!privdata->cl_data)
>                  return -ENOMEM;
> ...
> - return amd_sfh_hid_client_init(privdata);
> + return 0;
>
> The issue was fixed upstream in 5.15-rc4 by the commit:
>
> commit 88a04049c08cd62e698bc1b1af2d09574b9e0aee
> Author: Basavaraj Natikar <Basavaraj.Natikar at amd.com>
> Date: Thu Sep 23 17:59:27 2021 +0530
> Subject: HID: amd_sfh: Fix potential NULL pointer dereference
> Link: https://github.com/torvalds/linux/commit/88a04049c08cd62e698bc1b1af2d09574b9e0aee
>
> The fix places the call to amd_sfh_hid_client_init() after privdata->cl_data is
> allocated, and it changes the order of amd_sfh_hid_client_init() to happen
> before devm_add_action_or_reset() fixing the actual null pointer dereference
> which caused these commits to exist.
>
> This patch also landed in 5.14.10 -stable, but it seems it was omitted from
> being backported to impish, likely due to it sharing the exact same subject line
> as the regression commit, so it was likely dropped as a duplicate?
>
> [Testcase]
>
> You need an AMD Ryzen based system that has a AMD Sensor Fusion Hub HID device
> built in to test this.
>
> Simply booting the system is enough to trigger the issue.
>
> A test kernel is available in the following ppa:
>
> https://launchpad.net/~mruffell/+archive/ubuntu/lp1956519-test
>
> A community user has tested the test kernel, and has confirmed that it fixes the
> issue.
>
> [Where problems could occur]
>
> If a regression were to occur, it would only affect AMD Ryzen based systems with
> the AMD Sensor Fusion Hub HID device SOC. Since the changes affect the device
> initialisation function, a regression could cause systems to panic during boot,
> forcing users to revert to older kernels to start their systems.
>
> Saying that, the patch is present in 5.15-rc4 and is in 5.14.10, and is in
> widespread use, and is already present in Jammy.
>
> Basavaraj Natikar (1):
>    HID: amd_sfh: Fix potential NULL pointer dereference
>
>   drivers/hid/amd-sfh-hid/amd_sfh_pcie.c | 12 ++++--------
>   1 file changed, 4 insertions(+), 8 deletions(-)
>

Applied to impish:linux.

Thanks,
Kleber




More information about the kernel-team mailing list