Re: [Bug] nvme blocks PC10 since v5.15 - bisected

From: Keith Busch
Date: Thu Feb 10 2022 - 09:56:44 EST


On Thu, Jan 27, 2022 at 08:02:07PM +0100, Rafael J. Wysocki wrote:
> On Fri, Jan 21, 2022 at 10:09 PM Keith Busch <kbusch@xxxxxxxxxx> wrote:
> >
> > On Fri, Jan 21, 2022 at 08:00:49PM +0100, Rafael J. Wysocki wrote:
> > > Hi Keith,
> > >
> > > It is reported that the following commit
> > >
> > > commit e5ad96f388b765fe6b52f64f37e910c0ba4f3de7
> > > Author: Keith Busch <kbusch@xxxxxxxxxx>
> > > Date: Tue Jul 27 09:40:44 2021 -0700
> > >
> > > nvme-pci: disable hmb on idle suspend
> > >
> > > An idle suspend may or may not disable host memory access from devices
> > > placed in low power mode. Either way, it should always be safe to
> > > disable the host memory buffer prior to entering the low power mode, and
> > > this should also always be faster than a full device shutdown.
> > >
> > > Signed-off-by: Keith Busch <kbusch@xxxxxxxxxx>
> > > Reviewed-by: Sagi Grimberg <sagi@xxxxxxxxxxx>
> > > Signed-off-by: Christoph Hellwig <hch@xxxxxx>
> > >
> > > is the source of a serious power regression occurring since 5.15
> > > (please see https://bugzilla.kernel.org/show_bug.cgi?id=215467).
> > >
> > > After this commit, the SoC on the affected system cannot enter
> > > C-states deeper than PC2 while suspended to idle which basically
> > > defeats the purpose of suspending.
> > >
> > > What may be happening is that nvme_disable_prepare_reset() that is not
> > > called any more in the ndev->nr_host_mem_descs case somehow causes the
> > > LTR of the device to change to "no requirement" which allows deeper
> > > C-states to be entered.
> > >
> > > Can you have a look at this, please?
> >
> > I thought platforms that wanted full device shutdown behaviour would
> > always set acpi_storage_d3. Is that not happening here?
>
> Evidently, it isn't.

Apparently it works fine when you disable VMD, so it sounds like
acpi_storage_d3 is set, but we fail to find the correct ACPI companion
device when the drive is in a VMD domain.
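
To make the suspected failure mode concrete, here is a rough model of the
idle-suspend decision as discussed in this thread. It is a sketch, not the
driver code: the names Drive, storage_d3_hint_found, hmb_forces_shutdown and
idle_suspend_action are all hypothetical, and the real path has more
conditions than shown here.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    # Controller uses a host memory buffer (HMB).
    has_hmb: bool
    # The platform's "use full shutdown" hint was found for this device.
    # Hypothesis from this thread: the lookup fails behind VMD, so this
    # ends up False even though the platform does set the hint.
    storage_d3_hint_found: bool


def idle_suspend_action(drive: Drive, hmb_forces_shutdown: bool) -> str:
    """Return what a simplified driver would do on suspend-to-idle.

    hmb_forces_shutdown=True models the behaviour before the bisected
    commit; False models the behaviour after it.
    """
    if drive.storage_d3_hint_found:
        return "full shutdown"
    if drive.has_hmb:
        if hmb_forces_shutdown:
            return "full shutdown"
        return "disable HMB, low-power state"
    return "low-power state"


# Affected system as hypothesised: HMB in use, hint not found behind VMD.
behind_vmd = Drive(has_hmb=True, storage_d3_hint_found=False)
before = idle_suspend_action(behind_vmd, hmb_forces_shutdown=True)
after = idle_suspend_action(behind_vmd, hmb_forces_shutdown=False)
```

In this model the drive behind VMD was only being shut down because it
happened to use an HMB. Once that shortcut is gone, the missing hint is
exposed, which would match things working again with VMD disabled.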