Re: [PATCH v2 3/3] nvme: Enable autonomous power state transitions

From: J Freyensee
Date: Fri Sep 02 2016 - 17:15:25 EST


On Tue, 2016-08-30 at 14:59 -0700, Andy Lutomirski wrote:
> NVME devices can advertise multiple power states.  These states can
> be either "operational" (the device is fully functional but possibly
> slow) or "non-operational" (the device is asleep until woken up).
> Some devices can automatically enter a non-operational state when
> idle for a specified amount of time and then automatically wake back
> up when needed.
>
> The hardware configuration is a table.  For each state, an entry in
> the table indicates the next deeper non-operational state, if any,
> to autonomously transition to and the idle time required before
> transitioning.
>
> This patch teaches the driver to program APST so that each
> successive non-operational state will be entered after an idle time
> equal to 100% of the total latency (entry plus exit) associated with
> that state.  A sysfs attribute 'apst_max_latency_us' gives the
> maximum acceptable latency in ns; non-operational states with total
> latency greater than this value will not be used.  As a special
> case, apst_max_latency_us=0 will disable APST entirely.
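
As I read the description above, the table programming could be
sketched roughly as follows.  This is only an illustration of the
stated rule, not the patch's code: the PowerState type, its field
names, and build_apst_table() are hypothetical names, and the example
latencies are made up.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PowerState:
    # Hypothetical representation of one power state descriptor.
    entry_lat_us: int
    exit_lat_us: int
    non_operational: bool


def build_apst_table(
    states: List[PowerState], max_latency_us: int
) -> List[Optional[Tuple[int, int]]]:
    """For each state, return (target_state, idle_time_us), or None
    when no autonomous transition is programmed for that state."""
    table: List[Optional[Tuple[int, int]]] = [None] * len(states)
    if max_latency_us == 0:
        return table  # special case: APST disabled entirely

    target = None  # next deeper usable non-operational state, if any
    # Walk from the deepest state up to state 0.
    for i in range(len(states) - 1, -1, -1):
        table[i] = target
        total = states[i].entry_lat_us + states[i].exit_lat_us
        # A state becomes a target only if it is non-operational and
        # its total latency is within the limit; the idle time is set
        # to 100% of that total latency.
        if states[i].non_operational and total <= max_latency_us:
            target = (i, total)
    return table


states = [
    PowerState(0, 0, False),
    PowerState(0, 0, False),
    PowerState(200, 1200, True),
    PowerState(2000, 20000, True),
    PowerState(5000, 50000, True),
]
table = build_apst_table(states, 25000)
```

With these made-up numbers and the default limit of 25000, the
deepest state (total latency 55000) is skipped, and the operational
states go straight to the shallowest usable non-operational state.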

May I ask a dumb question?

How does this work with multiple NVMe devices plugged into a system?  I
would have thought we'd want one apst_max_latency_us entry per NVMe
controller for individual control of each device?  I have two
Fultondale-class devices plugged into a system I tried these patches on
(the 4.8-rc4 kernel) and I'm not sure how the single
/sys/module/nvme_core/parameters/apst_max_latency_us would work per my
2 devices (and the value is using the default 25000).

Now from
nvme id-ctrl /dev/nvme0 (or nvme1)

NVME Identify Controller:
vid     : 0x8086
ssvid   : 0x8086
sn      : CVFT41720018800HGN
mn      : INTEL SSDPE2MD800G4
fr      : 8DV10151
rab     : 0
ieee    : 5cd2e4
cmic    : 0
mdts    : 5
cntlid  : 0
ver     : 0
rtd3r   : 0
rtd3e   : 0
oaes    : 0
oacs    : 0x6
acl     : 3
aerl    : 3
frmw    : 0x2
lpa     : 0x2
elpe    : 63
npss    : 0
avscc   : 0
apsta   : 0 <-----

so the Fultondales don't support APST.
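
For anyone who wants to script that check across several controllers,
here is a minimal sketch that reads the apsta field out of the text
that 'nvme id-ctrl' prints.  The helper names parse_id_ctrl() and
supports_apst() are hypothetical, and the parsing assumes the
"name : value" layout shown above; per the NVMe spec, bit 0 of APSTA
indicates APST support.

```python
def parse_id_ctrl(text):
    """Collect 'name : value' lines from id-ctrl text output."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if not sep or not value:
            continue  # skip headings and lines without a value
        fields[name.strip()] = value
    return fields


def supports_apst(fields):
    # Base 0 lets int() accept both "0" and "0x1" style values.
    apsta = int(fields.get("apsta", "0"), 0)
    return (apsta & 1) == 1


sample = """NVME Identify Controller:
vid     : 0x8086
npss    : 0
apsta   : 0
"""
fields = parse_id_ctrl(sample)
apst_ok = supports_apst(fields)
```

Feeding it the captured output of 'nvme id-ctrl /dev/nvme0' and
'nvme id-ctrl /dev/nvme1' in turn would give a per-controller answer.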

But I'd still like to ask the dumb question :-).

>
> On hardware without APST support, apst_max_latency_us will not be
> exposed in sysfs.

Not sure that is true, as from what I see so far, Fultondales don't
support apst yet I still see:

[root@nvme-fabric-host01 nvme-cli]# cat /sys/module/nvme_core/parameters/apst_max_latency_us
25000

>
> In theory, the device can expose "default" APST table, but this
> doesn't seem to function correctly on my device (Samsung 950), nor
> does it seem particularly useful.  There is also an optional
> mechanism by which a configuration can be "saved" so it will be
> automatically loaded on reset.  This can be configured from
> userspace, but it doesn't seem useful to support in the driver.
>
> On my laptop, enabling APST seems to save nearly 1W.
>
> The hardware tables can be decoded in userspace with nvme-cli.
> 'nvme id-ctrl /dev/nvmeN' will show the power state table and
> 'nvme get-feature -f 0x0c -H /dev/nvme0' will show the current APST
> configuration.

nvme get-feature -f 0x0c -H /dev/nvme0

isn't working for me; I get:

[root@nvme-fabric-host01 nvme-cli]# ./nvme get-feature -f 0x0c -H /dev/nvme0
NVMe Status:INVALID_FIELD(2)

I don't have the time right now to investigate further, but I'll assume
it's because I have Fultondales (though I would have thought this patch
would have provided enough code for the latest nvme-cli code to do this
new get-feature as-is).

Jay