ACK/Cmnt: [SRU v2] [G] [PATCH 0/6] Prevent thermal shutdown during boot process

Stefan Bader stefan.bader at canonical.com
Wed Jan 27 08:30:47 UTC 2021


On 27.01.21 04:05, Kai-Heng Feng wrote:
> BugLink: https://bugs.launchpad.net/bugs/1906168
> 
> [Impact]
> Unexpected thermal shutdown at boot on Intel-based mobile workstations.
> 
> [Fix]
> Since these thermal devices are not in an ACPI ThermalZone, the OS
> shouldn't shut down the system.
> 
> These critical temperatures are for userspace to handle, so let the kernel
> know it shouldn't handle them.
> 
> For Groovy, a patch that removes .notify callback is dropped.
> 
> [Test]
> Use reboot stress as a reproducer. There is a 5% chance of seeing an
> unexpected shutdown at boot.
> 
> With the fix applied, the thermal shutdown is no longer reproducible.
> 
> [Where problems could occur]
> For ACPI based platforms, we still have "acpitz" to protect systems from
> overheating. If these acpitz sensors don't work, then the system could
> face a real overheating issue.
> 
> Daniel Lezcano (4):
>   thermal/core: Emit a warning if the thermal zone is updated without
>     ops
>   thermal/core: Add critical and hot ops
>   thermal/drivers/acpi: Use hot and critical ops
>   thermal/drivers/rcar: Remove notification usage
> 
> Kai-Heng Feng (2):
>   thermal: int340x: Fix unexpected shutdown at critical temperature
>   thermal: intel: pch: Fix unexpected shutdown at critical temperature
> 
>  drivers/acpi/thermal.c                        | 30 ++++++------
>  .../int340x_thermal/int340x_thermal_zone.c    |  6 +++
>  drivers/thermal/intel/intel_pch_thermal.c     |  6 +++
>  drivers/thermal/rcar_thermal.c                | 19 --------
>  drivers/thermal/thermal_core.c                | 46 ++++++++++++-------
>  include/linux/thermal.h                       |  3 ++
>  6 files changed, 58 insertions(+), 52 deletions(-)
> 

The actual fix seems to be patches #5 and #6. The other patches add some new
hooks which the fixes depend on or rate limit some warning (probably not
strictly needed but makes things more bearable). Just to double check on patch
#4: is that still needed now that notify remains? I don't think it is fatal to
remove the message but I wonder whether there would really be anything else
issuing the warning without adding a critical hook.
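
For reference, the dependency between the hook patches and the two fixes can
be pictured with a small userspace model. This is only a sketch of the idea,
not the kernel code: struct zone, zone_register(), handle_critical_trip() and
the shutdown_requested flag are names made up for illustration, and the
assumption is that the core falls back to a default shutdown action whenever a
driver does not supply its own critical hook.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct zone;

struct zone_ops {
	void (*critical)(struct zone *tz);	/* optional per-driver hook */
};

struct zone {
	const char *type;
	struct zone_ops ops;
	bool shutdown_requested;	/* model-only flag, replaces a real power-off */
};

/* Default action used when a driver supplies no critical hook. */
static void default_critical(struct zone *tz)
{
	printf("%s: critical temperature reached, shutting down\n", tz->type);
	tz->shutdown_requested = true;
}

/* Driver-style hook: only log, leave the decision to userspace. */
static void log_only_critical(struct zone *tz)
{
	printf("%s: critical temperature reached\n", tz->type);
}

/* Registration fills in the default so dispatch never sees a NULL hook. */
static void zone_register(struct zone *tz)
{
	assert(tz != NULL);
	tz->shutdown_requested = false;
	if (!tz->ops.critical)
		tz->ops.critical = default_critical;
}

/* Called by the model when a critical trip point is crossed. */
static void handle_critical_trip(struct zone *tz)
{
	assert(tz != NULL && tz->ops.critical != NULL);
	tz->ops.critical(tz);
}
```

In this model, a zone registered without a hook (the "acpitz" case) still
ends up requesting a shutdown, while a zone registered with
log_only_critical (standing in for what patches #5 and #6 do for int340x and
the PCH driver) only logs the event.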

Provisional Ack for the set, but the need for patch #4 should be clarified.

Acked-by: Stefan Bader <stefan.bader at canonical.com>

Thanks,
Stefan
