Thermal management

John Hupp lubuntu at prpcompany.com
Tue Jun 17 19:56:10 UTC 2014


This is a fresh attempt to summarize what I've been looking at the past 
few weeks, with a nod to Phill W's request.  People with more 
specialized expertise will likely want to correct this in any number of 
ways.  Better knowledge is very much welcome.

Fan control, especially on laptops, which was my central interest, is 
part of the larger topic of thermal management, which includes active 
and passive methods.  Fan control is the usual active method.  Passive 
methods include several technologies for throttling devices, and these 
methods have varying effects on performance.  Thermal management has 
historically focused on the CPU, but has broadened to embrace the GPU, 
hard drive, LCD screen, and the entire enclosure.  This has become 
increasingly important as form factors have shrunk.

Thermal management for a desktop is usually an easier proposition 
because the heat sources are not as crammed together, because it's 
easier to add fans if needed, and because the thermal management 
technologies most commonly used in a desktop are, in general, better 
supported and more fully exposed by the OEMs.

The situation for laptops is, in general, the opposite of everything 
just said about desktops.

*Some further orientation and outline of the various approaches*:

BIOS (presumably similar for UEFI): This is the first thing to look at.  
The BIOS likely sets up some thermal management, and may present simple 
or sophisticated controls in its interface.  In other cases the OEM has 
decided not to present any user-configurable controls there whatsoever.  
When an OS has booted, the BIOS may relinquish all or part of thermal 
control to it (commonly via ACPI).  But it may continue to exercise 
control through SMM (System Management Mode), which temporarily suspends 
processing by the OS in a way that is transparent to it, and runs some 
SMM BIOS code.  By definition then, it can be difficult to know what, if 
anything, is controlled by SMM.  And SMM is platform-dependent, so there 
doesn't seem to be a standards-based way for developers to write code 
that works across whatever makes and models use SMM.

ACPI (Advanced Configuration and Power Interface): This is the successor 
to PNP configuration and APM power management.  Under ACPI, power 
management is no longer the responsibility of the BIOS via APM, but of 
the OS.  ACPI is closely related to OSPM (Operating System-directed 
configuration and Power Management), which has been described as a 
system implementing ACPI.  Proper functioning under ACPI requires 
support by the hardware, the BIOS, and the OS.  The ACPI BIOS loads some 
ACPI tables into memory, the most prominent of which is the DSDT 
(Differentiated System Description Table).  These tables provide 
hardware enumeration data and AML (ACPI Machine Language) bytecode.  The 
OS kernel uses an interpreter (ACPICA -- ACPI Component Architecture) to 
run the bytecode and employ the data to set everything up.  [A 
development note: stewardship of the ACPI specification was transferred 
to the UEFI Forum in 2013.]

ACPI and Sysfs: The state of the kernel is reflected to user space via 
sysfs, a virtual filesystem mounted at /sys.  In Windows-ish terms, this 
describes what device drivers are loaded and what their settings are.  
But in addition to the original device nodes that describe the kernel 
state most concisely, there are also symlinks to many of the original 
device nodes, set up in various /sys locations, serving various 
purposes.  So for thermal management purposes, one might be instructed 
to look at the contents of /sys/class/thermal, but a number of the 
folders there are symlinks to yet other /sys directories.  Some of the 
kernel parameters reflected at /sys are writable (or are supposed to be, 
or were at one time).
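
As a rough illustration of reading that tree, here is a minimal sketch 
that lists each thermal zone with its type and current temperature.  The 
function name list_thermal_zones is my own invention; the "type" and 
"temp" files are the attributes the kernel normally places in each 
thermal_zone directory, with temp reported in millidegrees Celsius.  The 
directory is a parameter (an assumption made for convenience) so the 
function can also be pointed at a copy of the tree.

```shell
#!/bin/sh
# Sketch: print "<zone> <type> <whole degrees C>" for every thermal zone.
# list_thermal_zones is a hypothetical helper name, not an existing tool.
list_thermal_zones() {
    base=${1:-/sys/class/thermal}
    for zone in "$base"/thermal_zone*; do
        # If the glob matched nothing, the literal pattern fails this test.
        [ -r "$zone/temp" ] || continue
        type=$(cat "$zone/type" 2>/dev/null || echo unknown)
        # Some zones return an error on read; skip those quietly.
        millideg=$(cat "$zone/temp" 2>/dev/null) || continue
        echo "${zone##*/} $type $((millideg / 1000))"
    done
}
```

Calling list_thermal_zones with no argument reads /sys/class/thermal; 
on a machine with no thermal zones it simply prints nothing.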

Hwmon - a sysfs extension: This extension to sysfs provides alternate 
interfaces under /sys to report or control kernel parameters that may 
also be represented elsewhere in /sys.  But some user applications are 
written such that they rely exclusively on the hwmon interfaces.
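
To see what the hwmon side reports, something like the following sketch 
may help.  It assumes the usual hwmon layout (a "name" file and 
tempN_input files in millidegrees Celsius directly under each hwmonN 
directory); I understand that on some older kernels these attributes sit 
under a device/ subdirectory instead, which this sketch does not handle.  
The function name list_hwmon_temps is hypothetical.

```shell
#!/bin/sh
# Sketch: print "<hwmonN> <chip name> <tempN> <whole degrees C>" per sensor.
# list_hwmon_temps is a hypothetical helper name.
list_hwmon_temps() {
    base=${1:-/sys/class/hwmon}
    for dev in "$base"/hwmon*; do
        [ -d "$dev" ] || continue
        name=$(cat "$dev/name" 2>/dev/null || echo unknown)
        for input in "$dev"/temp*_input; do
            [ -r "$input" ] || continue
            value=$(cat "$input" 2>/dev/null) || continue
            label=${input##*/}          # e.g. temp1_input
            echo "${dev##*/} $name ${label%_input} $((value / 1000))"
        done
    done
}
```

If this prints nothing, there is probably no hwmon temperature sensor 
for user tools to rely on.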

ACPI and Procfs - What is now accomplished for thermal management 
purposes by the sysfs mounted at /sys, was previously accomplished by 
the procfs mounted at /proc.  So there is a lot of documentation 
regarding, for instance, /proc/acpi/fan and 
/proc/acpi/thermal_zone/*/trip_points (see 
https://wiki.ubuntu.com/DebuggingACPI), but that information is now 
obsolete.

Generic Thermal Management Framework - Much of thermal policy or 
decision-making has been handled in the OS by the kernel, but exposed to 
some degree to user control via sysfs.  The idea under this framework is 
to reduce the role the kernel plays to that of a facilitator, and leave 
policy/decision-making to user-land tools.  It is worth noting, though, 
that such user tools add no capabilities beyond what the lower-level 
methods already provide.

*SOLUTIONS* (check the standard repositories first for any additional 
packages you want to download)

BIOS: Start here.  This may be all you need to improve your fan 
control.  Sometimes an updated BIOS is required.

lm-sensors + fancontrol (http://www.lm-sensors.org): This solution 
explicitly relies on the hwmon interfaces of /sys, and it only works 
with PWM (pulse-width modulated) fan controllers, not with 
voltage-regulated controllers (and no, I don't know why on either 
count).  But its README says this:

    Laptops, on the other hand, rarely expose any hardware monitoring
    chip. They often have some BIOS and/or ACPI magic to get the CPU
    temperature value, but that's about it. For such laptops, the
    lm-sensors package is of no use (sensors-detect will not find
    anything), and you have to use acpi instead.
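
A quick way to check whether fancontrol has anything to work with is to 
look for PWM control files under the hwmon tree.  The sketch below 
assumes the usual hwmon naming, where the control file is pwmN and 
companion files such as pwmN_enable carry an underscore; the function 
name list_pwm_controls is my own.

```shell
#!/bin/sh
# Sketch: print every hwmon PWM control file; return non-zero if none exist.
# list_pwm_controls is a hypothetical helper name.
list_pwm_controls() {
    base=${1:-/sys/class/hwmon}
    found=0
    for pwm in "$base"/hwmon*/pwm[0-9]*; do
        [ -e "$pwm" ] || continue
        case ${pwm##*/} in
            *_*) continue ;;    # skip pwm1_enable, pwm1_mode and the like
        esac
        echo "$pwm"
        found=1
    done
    [ "$found" -eq 1 ]
}
```

For example, "list_pwm_controls || echo none" prints "none" on a 
machine like the one the README describes.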

ACPI - Editing/creating thermal trip points:  Regarding thermal 
management, you will see documentation that commands like these should 
change the thermal trip point temperature:
$ sudo sh -c "echo 75000 > /sys/class/thermal/thermal_zone0/trip_point_1_temp"
or, from within the thermal_zone0 directory,
$ echo 75000 | sudo tee trip_point_1_temp
But this does not seem to be supported by recent kernels and yielded 
"Permission denied" errors in my tests.  I have seen conflicting 
information on whether trip points should be editable.

Concerning laptops, there is this statement at 
https://01.org/linux-acpi/documentation/debug-how-isolate-linux-acpi-issues:
"Most notebooks also use native fan control instead of ACPI. There are, 
however, a couple of notable exceptions: HP/Compaq, Acer, and 
Fujitsu-Siemens often use ACPI-based fan-control."
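
Before trying to write a trip point, it may be worth listing what a zone 
defines and whether each temperature file carries any write permission 
bit at all.  This is a sketch under assumptions: the function name 
list_trip_points is hypothetical, and it inspects the mode string from 
ls rather than using "test -w" because, as far as I understand it, 
sysfs refuses to open an attribute for writing when it has no write bit, 
even for root.  A file that shows as writable can of course still be 
rejected by the driver.

```shell
#!/bin/sh
# Sketch: print "<trip point> <type> <whole degrees C> <writable|read-only>".
# list_trip_points is a hypothetical helper name.
list_trip_points() {
    zone=${1:-/sys/class/thermal/thermal_zone0}
    for temp_file in "$zone"/trip_point_*_temp; do
        [ -r "$temp_file" ] || continue
        kind=$(cat "${temp_file%_temp}_type" 2>/dev/null || echo unknown)
        millideg=$(cat "$temp_file" 2>/dev/null) || continue
        # Characters 2-10 of the ls mode string are the permission bits.
        perms=$(ls -ld "$temp_file" | cut -c2-10)
        case $perms in
            *w*) mode=writable ;;
            *)   mode=read-only ;;
        esac
        name=${temp_file##*/}
        echo "${name%_temp} $kind $((millideg / 1000)) $mode"
    done
}
```

A zone whose trip points all show as read-only would be consistent with 
the "Permission denied" errors described above.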

ACPI - Overriding the DSDT table: There is documentation (e.g. Patching 
DSDT in recent Linux kernels without recompiling 
<http://blog.michael.kuron-germany.de/2011/03/patching-dsdt-in-recent-linux-kernels-without-recompiling/>) 
about how to edit the DSDT table that the BIOS presents to the kernel, 
and then direct the kernel to use the edited table.  This might serve as 
a method for creating or modifying ACPI thermal trip points, but I have 
not tried it.  One source with more expertise says that this *might* 
work, but notes that if a fan is controlled by SMM, this may overrule 
something set up in ACPI.
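
For anyone who only wants to look at the DSDT before deciding whether to 
edit it, the read-only first step can be sketched as below.  This covers 
extraction and disassembly only, not the override itself.  It assumes the 
kernel exposes the table at /sys/firmware/acpi/tables/DSDT (normally 
readable by root only) and that the iasl compiler from the ACPICA tools 
is installed; the function name dump_dsdt and the default output name 
dsdt.dat are my own choices.

```shell
#!/bin/sh
# Sketch: save a copy of the DSDT and disassemble it if iasl is available.
# dump_dsdt is a hypothetical helper name.
dump_dsdt() {
    src=${1:-/sys/firmware/acpi/tables/DSDT}
    out=${2:-dsdt.dat}
    if [ ! -r "$src" ]; then
        echo "cannot read $src (root is usually required)" >&2
        return 1
    fi
    cat "$src" > "$out" || return 1
    if command -v iasl >/dev/null 2>&1; then
        iasl -d "$out"      # writes the ASL source next to it, e.g. dsdt.dsl
    else
        echo "iasl not found; raw table saved to $out" >&2
    fi
}
```

Searching the resulting .dsl file for "ThermalZone" should show whether 
the firmware defines any thermal zones and trip point methods at all.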

i8kutils (i8kctl 
<http://manpages.ubuntu.com/manpages/trusty/man1/i8kctl.1.html> and 
i8kmon <http://manpages.ubuntu.com/manpages/trusty/man1/i8kmon.1.html>): 
This relies on SMM, and as a platform-dependent solution, the authors 
are aiming to support only Dell laptops.  But this is probably the best 
solution for those.

thinkpad-acpi (http://www.thinkwiki.org/wiki/How_to_control_fan_speed): 
This is an extension of the ACPI support provided by the standard 
kernel.  Note that it does not support all Lenovo laptops.  The Lenovo 
3000 series, for instance, are not ThinkPads and are not supported by 
this extension.
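
On a machine that thinkpad-acpi does support, the driver reports fan 
state in /proc/acpi/ibm/fan as "key: value" lines such as status, speed 
and level.  The sketch below pulls out one field.  I could not test it 
on real hardware (the Lenovo 3000 has no such file), so treat the file 
format as an assumption based on the ThinkWiki page above; the function 
name thinkpad_fan_field is hypothetical.

```shell
#!/bin/sh
# Sketch: print the value of one field (status, speed, level) from the
# thinkpad-acpi fan file; return non-zero if the file or field is missing.
# thinkpad_fan_field is a hypothetical helper name.
thinkpad_fan_field() {
    field=$1
    file=${2:-/proc/acpi/ibm/fan}
    [ -r "$file" ] || return 1
    awk -F ':[ \t]*' -v key="$field" '
        $1 == key { print $2; found = 1; exit }
        END       { exit !found }
    ' "$file"
}
```

For example, "thinkpad_fan_field speed" would print the reported fan 
speed, and fails quietly where the driver is not loaded.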

Asus: See 
http://forum.notebookreview.com/asus/705656-fan-control-asus-prime-ux31-ux31a-ux32a-ux32vd.html 
and https://gist.github.com/felipec/6169047 and 
https://help.ubuntu.com/community/AsusZenbookPrime#Sensors_.28temps_.26_fans.29 
as good starting points.

Thermald: See the links below for information about the thermal 
daemon new to 14.04.  It is an implementation of the Generic Thermal 
Management Framework.  It controls cooling via
- the Running Average Power Limit (RAPL) driver (Sandy Bridge upwards)
- the Intel P-state CPU frequency driver (Sandy Bridge upwards)
- the cpufreq driver
- the Intel PowerClamp driver
- active or passive cooling devices as presented in sysfs (but it cannot 
create any new devices; if there is no FAN device, it will not control 
the fan)
https://wiki.ubuntu.com/Kernel/PowerManagement/ThermalIssues
http://manpages.ubuntu.com/manpages/trusty/en/man5/thermal-conf.xml.5.html
https://01.org/linux-thermal-daemon/documentation/introduction-thermal-daemon
http://www.linux.com/news/featured-blogs/200-libby-clark/721494-linux-thermal-daemon-monitors-and-controls-temperature-in-tablets-laptops
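
Since thermald can only drive cooling devices that already exist in 
sysfs, it is worth checking what is there before installing it.  The 
following sketch lists the cooling_device entries under 
/sys/class/thermal and tests for one whose type is "Fan" (the type 
string I would expect from the ACPI fan driver; other drivers may use 
different names).  Both function names are hypothetical.

```shell
#!/bin/sh
# Sketch: print "<device> <type> <cur_state>/<max_state>" per cooling device.
# list_cooling_devices and has_fan_device are hypothetical helper names.
list_cooling_devices() {
    base=${1:-/sys/class/thermal}
    for dev in "$base"/cooling_device*; do
        [ -r "$dev/type" ] || continue
        type=$(cat "$dev/type")
        cur=$(cat "$dev/cur_state" 2>/dev/null || echo '?')
        max=$(cat "$dev/max_state" 2>/dev/null || echo '?')
        echo "${dev##*/} $type $cur/$max"
    done
}

# Succeeds only if some cooling device has the type "Fan" (any case).
has_fan_device() {
    list_cooling_devices "$1" | grep -qi ' fan '
}
```

If "has_fan_device || echo 'no fan device'" reports no fan device, 
thermald is limited to the passive methods in the list above.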

*MY CURRENT CASE*

I wanted the fan to start at a lower temperature on a Lenovo 3000 C200 
laptop.

The BIOS exposes no thermal settings.  But someone who previously made a 
serious attempt at this on a Lenovo 3000 N200 says that the fan is 
controlled by SMM.  This makes sense, since the fan trips at the same 
temperature in Windows or Lubuntu.  It's notable that active cooling is 
done via SMM, while passive cooling is via ACPI.

Lm-sensors: It does not find a PWM controller that it can work with.

ACPI: There are only two trip points defined, both passive, and the 
lower of them is at 87C.  There is no FAN device in /sys/class/thermal, 
and attempts to modify trip point settings resulted in "Permission 
denied."  I still wonder if editing the DSDT table might net me any 
gains, but I decided for the time being not to invest any more time, 
especially with the prospect that SMM would undo my work.

Thinkpad-acpi: It does not support the Lenovo 3000.

Thermald: I installed this with high hopes, but found out that it cannot 
control a fan if an ACPI fan device does not already exist.  And since 
this laptop does not have a Sandy Bridge or newer processor to take 
advantage of the RAPL or P-state drivers, thermald does not bring much 
to the table that ACPI did not already provide.

Further recourse: The only measures I can imagine now are to write 
something like i8kutils for this Lenovo platform, or to edit the DSDT 
table just to see what happens.  I say "imagine" because I will very 
likely do neither!  One source also mentioned the prospect of 
controlling the embedded controller for the fan via hwmon, but I haven't 
seen any details on how to accomplish that apart from lm-sensors.

A small bit of consolation: I recall that Speedfan regards 50C as a good 
trip point, and thermald will try to keep the CPU under 45C.  But I have 
read that it's generally OK for laptops to run somewhat hotter than 
desktops.  This laptop fan kicks on around 68C and seems to hold the 
line pretty well, so maybe the thermal management is not as bad as I 
first thought.
