ACK/Cmnt: [SRU v2] [G] [PATCH 0/6] Prevent thermal shutdown during boot process

Kai-Heng Feng kai.heng.feng at canonical.com
Wed Jan 27 11:58:26 UTC 2021


On Wed, Jan 27, 2021 at 4:30 PM Stefan Bader <stefan.bader at canonical.com> wrote:
>
> On 27.01.21 04:05, Kai-Heng Feng wrote:
> > BugLink: https://bugs.launchpad.net/bugs/1906168
> >
> > [Impact]
> > Surprising thermal shutdown at boot on Intel based mobile workstations.
> >
> > [Fix]
> > Since these thermal devices are not in an ACPI ThermalZone, the OS
> > shouldn't shut down the system.
> >
> > These critical temperatures are for userspace to handle, so let the
> > kernel know it shouldn't handle them.
> >
> > For Groovy, the patch that removes the .notify callback is dropped.
> >
> > [Test]
> > Use a reboot stress test as a reproducer. There is a 5% chance of
> > seeing a surprising shutdown at boot.
> >
> > With the fix applied, the thermal shutdown is no longer reproducible.
> >
> > [Where problems could occur]
> > For ACPI-based platforms, we still have "acpitz" to protect systems from
> > overheating. If these acpitz sensors don't work, then the system could
> > face a real overheating issue.
> >
> > Daniel Lezcano (4):
> >   thermal/core: Emit a warning if the thermal zone is updated without
> >     ops
> >   thermal/core: Add critical and hot ops
> >   thermal/drivers/acpi: Use hot and critical ops
> >   thermal/drivers/rcar: Remove notification usage
> >
> > Kai-Heng Feng (2):
> >   thermal: int340x: Fix unexpected shutdown at critical temperature
> >   thermal: intel: pch: Fix unexpected shutdown at critical temperature
> >
> >  drivers/acpi/thermal.c                        | 30 ++++++------
> >  .../int340x_thermal/int340x_thermal_zone.c    |  6 +++
> >  drivers/thermal/intel/intel_pch_thermal.c     |  6 +++
> >  drivers/thermal/rcar_thermal.c                | 19 --------
> >  drivers/thermal/thermal_core.c                | 46 ++++++++++++-------
> >  include/linux/thermal.h                       |  3 ++
> >  6 files changed, 58 insertions(+), 52 deletions(-)
> >
>
> The actual fix seems to be patches #5 and #6. The other patches add some new
> hooks which the fixes depend on or rate limit some warning (probably not
> strictly needed but makes things more bearable). Just to double check on patch
> #4: is that still needed now that notify remains? I don't think it is fatal to
> remove the message but I wonder whether there would really be anything else
> issuing the warning without adding a critical hook.

The patch does nothing here, and I'd like to stay as close to upstream as
possible, so I prefer not to drop it.

Kai-Heng

>
> Provisional Ack for the set but should clarify the need for patch #4.
>
> Acked-by: Stefan Bader <stefan.bader at canonical.com>
>
> Thanks,
> Stefan
>
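
For readers following the thread, the mechanism described under [Fix] can
be sketched as a small userspace model. This is an illustration only and
not code from the patch set: the struct layout and the names register_zone,
handle_critical_trip, default_critical and log_only_critical are
hypothetical. The assumption is that a zone may supply its own critical
hook, and that a default shutdown handler is used when it does not.

```c
#include <stddef.h>
#include <stdio.h>

/* Userspace model only: struct and function names are hypothetical,
 * loosely echoing the series, and are not the kernel's definitions. */
struct zone;

struct zone_ops {
	/* Optional override for a critical trip; NULL means "use default". */
	void (*critical)(struct zone *z);
};

struct zone {
	const char *type;
	struct zone_ops ops;
	int shutdown_requested; /* set by the default handler */
	int events_logged;      /* incremented by the log-only handler */
};

/* Default policy: a critical trip requests a shutdown. */
static void default_critical(struct zone *z)
{
	z->shutdown_requested = 1;
	printf("%s: critical temperature reached, shutting down\n", z->type);
}

/* Override for zones whose critical trips are left to userspace:
 * record the event and do not request a shutdown. */
static void log_only_critical(struct zone *z)
{
	z->events_logged++;
	printf("%s: critical temperature reached\n", z->type);
}

/* Registration fills in the default when no hook was supplied. */
static void register_zone(struct zone *z)
{
	if (z->ops.critical == NULL)
		z->ops.critical = default_critical;
}

/* Dispatch a critical trip to whichever handler the zone ended up with. */
static void handle_critical_trip(struct zone *z)
{
	z->ops.critical(z);
}
```

In this model, a zone registered without a hook still requests a shutdown
on a critical trip, while a zone registered with log_only_critical only
records the event, which is the split between "acpitz" and the other
thermal devices that the cover letter describes.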


