[PATCH v2 0/2] [SRU Y/Z/A] Enable APST for more NVMe

Kai-Heng Feng kai.heng.feng at canonical.com
Thu Jul 6 06:56:27 UTC 2017


On Thu, Jul 6, 2017 at 12:04 AM, Seth Forshee
<seth.forshee at canonical.com> wrote:
> On Thu, Jun 29, 2017 at 06:19:50PM +0800, Kai-Heng Feng wrote:
>> BugLink: https://bugs.launchpad.net/bugs/1700919
>>
>> [Impact]
>> NVMe consumes lots of power. It may not be an issue on power cord plugged
>> devices, but it will reduce usage time on battery powered devices.
>>
>> APST is enabled but it can't find power saving state to transits to:
>>
>> $ sudo nvme get-feature -f 0x0c -H /dev/nvme0
>> get-feature:0xc (Autonomous Power State Transition), Current value:0x000001
>>         Autonomous Power State Transition Enable (APSTE): Enabled
>>         Auto PST Entries .................
>>         Entry[ 0]
>>         .................
>>         Idle Time Prior to Transition (ITPT): 0 ms
>>         Idle Transition Power State (ITPS): 0
>>         .................
>>         Entry[ 1]
>>         .................
>>         Idle Time Prior to Transition (ITPT): 0 ms
>>         Idle Transition Power State (ITPS): 0
>> ...
>>
>> [Fix]
>> The original logic is to skip power states that enlat + exlat larger than
>> certain latency limit, which is 25000 originally.
>>
>> These first patch change the logic only to consider exlat - because that's the
>> most cases that actually happen.
>>
>> The second patch relax the lantency limit - hence more power saving states can
>> be choosen to transit to.
>>
>> [Test Case]
>> APST is enabled, also there's power saving state (3) to transit to:
>>
>> $ sudo nvme get-feature -f 0x0c -H /dev/nvme0
>> get-feature:0xc (Autonomous Power State Transition), Current value:0x000001
>>         Autonomous Power State Transition Enable (APSTE): Enabled
>>         Auto PST Entries .................
>>         Entry[ 0]
>>         .................
>>         Idle Time Prior to Transition (ITPT): 500 ms
>>         Idle Transition Power State (ITPS): 3
>>         .................
>>         Entry[ 1]
>>         .................
>>         Idle Time Prior to Transition (ITPT): 500 ms
>>         Idle Transition Power State (ITPS): 3
>> ...
>>
>> [Regression Potential]
>> Low. It can be disabled by kernel parameter.
>
> A kernel parameter workaround for cases where the hardware is broken is
> nice, but it's not sufficient justification for this being low
> regression potential. It's not acceptable to force a significant number
> of users have to file bugs, search through forums, etc. to find out how
> to fix their systems which were broken by a kernel upgrade.
>
> What we need to know from you is that we can have strong confidence that
> these changes aren't going to cause problems for users. Have the
> upstream developers determined the underlying reason that APST causes
> problems with some hardware and implemented a fix? Have these patches
> been tested on a large cross section of the NVMe drives in the wild?

Upstream maintainers are aware of the issue, and vendors are
investigating the problem.
The issue is at hardware level, the way NVMe talks to PCIe bus. So it
requires either a firmware fix or a hardware fix.
I personally tested 5 different NVMes without any issue. That said, it
may mean nothing - a different NVMe/PCIe combination may still trigger
the bug.

>
>> We already enabled APST on X/Y for a while, this SRU will include more NVMes
>> to enable APST. If any regression happens, user can disable APST in kernel
>> parameter. I'll also add new quirk for those devices to upstream kernel.
>>
>> There's an incident that user can't disable APST via kernel parameter
>> (LP: #1699004), it's being solved in upstream kernel and I'll backport the
>> patch in another SRU.
>>
>> Either way, NVMe APST is enabled by default from 4.11 onward - if faulty NVMes
>> need to be quirked, it's better be sooner rather than later.
>
> You're submitting this for stable releases though. We shouldn't rush
> into enabling APST globally until we're sure that most of the drives
> which need quirks have already been found.
>
> It's a NACK from me until you've convinced me that this really is
> unlikely to cause regressions for users.

You are right. We shouldn't expect users to tweak the knob if
regression happens.

Once Artful is released and being used by more audience without major
regression, will it be a "good enough" indication to backport APST to
older kernel?




More information about the kernel-team mailing list