Re: [PATCH v2] arm64: PCI: Add quirk for Qualcomm WoA devices

From: Shawn Guo
Date: Tue May 02 2023 - 04:48:11 EST


On Fri, Apr 28, 2023 at 04:30:27PM -0500, Bjorn Helgaas wrote:
> [+cc Andy, Bjorn A, plea for help from Qualcomm firmware folks]
>
> On Sun, Apr 23, 2023 at 11:05:20AM +0800, Shawn Guo wrote:
> > Commit 8fd4391ee717 ("arm64: PCI: Exclude ACPI "consumer" resources from
> > host bridge windows") introduced a check to remove host bridge register
> > resources for all arm64 platforms, with the assumption that the PNP0A03
> > _CRS resources would always be host bridge registers and never as windows
> > on arm64 platforms.
>
> That's not quite what the commit log says. The 8fd4391ee717
> assumption is that on arm64,
>
> - _CRS *consumer* resources are host bridge registers
> - _CRS *producer* resources are windows
>
> which I think matches the intent of the ACPI spec.

Yes, I will update.
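
To make sure I restate it correctly, here is a minimal userspace sketch of
the filtering as I understand it: entries flagged as windows (producers) are
kept, and everything else (consumers, assumed to be host bridge registers) is
dropped. The names model_res, MODEL_RES_WINDOW and model_keep_windows are
made up for illustration and are not the kernel's own structures or helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag standing in for the kernel's "window" resource flag. */
#define MODEL_RES_WINDOW 0x1u

/* Hypothetical, simplified stand-in for a host bridge _CRS entry. */
struct model_res {
	uint64_t start;
	uint64_t size;
	unsigned int flags;
};

/*
 * Keep only entries marked as windows (producers) and drop the rest
 * (consumers). Compacts the array in place and returns how many
 * entries were kept.
 */
static size_t model_keep_windows(struct model_res *res, size_t n)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		if (res[i].flags & MODEL_RES_WINDOW)
			res[kept++] = res[i];
	}
	return kept;
}
```

With this model, an entry built from the Memory32Fixed range above
(0x60200000, length 0x01DF0000) carries no window flag, so it is dropped and
nothing is left to allocate BARs from.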

>
> > The assumption stands true until Qualcomm WoA (Windows on ARM) devices
> > emerge. These devices describe host bridge windows in PNP0A03 _CRS
> > resources instead. For example, the Microsoft Surface Pro X has host
> > bridges defined as
> >
> >     Name (_CID, EisaId ("PNP0A03") /* PCI Bus */)  // _CID: Compatible ID
> >
> >     Method (_CRS, 0, NotSerialized)  // _CRS: Current Resource Settings
> >     {
> >         Name (RBUF, ResourceTemplate ()
> >         {
> >             Memory32Fixed (ReadWrite,
> >                 0x60200000,         // Address Base
> >                 0x01DF0000,         // Address Length
> >                 )
> >             ...
>
> > The Memory32Fixed holds a host bridge window, but it's not properly
> > defined as a "producer" resource.
>
> I assume you're saying the use of Memory32Fixed for a window is a
> firmware defect, right?

Yes, I will reword.
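
As a rough illustration of why this counts as a firmware defect, the sketch
below classifies a raw resource descriptor by its tag byte. It assumes the
large-descriptor encoding from the ACPI spec as I read it: the Word, DWord
and QWord address space descriptors carry a Consumer/Producer bit in their
General Flags byte, while the 32-bit fixed memory descriptor has no such
field. The enum and function names are hypothetical, and treating everything
else as a consumer mirrors the reading discussed here rather than anything
the spec states outright.

```c
#include <stddef.h>
#include <stdint.h>

/* Large resource descriptor tag bytes (ACPI resource data encoding). */
#define ACPI_TAG_MEMORY32FIXED 0x86
#define ACPI_TAG_DWORD_ADDR    0x87
#define ACPI_TAG_WORD_ADDR     0x88
#define ACPI_TAG_QWORD_ADDR    0x8A

/* Hypothetical classification used only by this sketch. */
enum model_usage { MODEL_CONSUMER, MODEL_PRODUCER };

/*
 * Address space descriptors: byte 4 is General Flags, and bit 0 is
 * Consumer/Producer (1 = consumer, 0 = producer). Any other descriptor,
 * including Memory32Fixed, has no such bit and is treated as a consumer.
 */
static enum model_usage model_resource_usage(const uint8_t *desc, size_t len)
{
	if (len < 5)
		return MODEL_CONSUMER;

	switch (desc[0]) {
	case ACPI_TAG_WORD_ADDR:
	case ACPI_TAG_DWORD_ADDR:
	case ACPI_TAG_QWORD_ADDR:
		return (desc[4] & 0x01) ? MODEL_CONSUMER : MODEL_PRODUCER;
	default:
		return MODEL_CONSUMER;
	}
}
```

Feeding it the encoded form of the Memory32Fixed entry above yields "consumer"
no matter what the firmware author intended, because the descriptor has no bit
that could express "producer".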

>
> (Per ACPI r6.5, sec 19.6.83, the Memory32Fixed descriptor cannot
> specify a Producer/Consumer ResourceUsage. I think that means the
> space is assumed to be ResourceConsumer.)
>
> > Consequently the resource gets removed by kernel, and the BAR
> > allocation fails later on:
> >
> > [ 0.150731] pci 0002:00:00.0: BAR 14: no space for [mem size 0x00100000]
> > [ 0.150744] pci 0002:00:00.0: BAR 14: failed to assign [mem size 0x00100000]
> > [ 0.150758] pci 0002:01:00.0: BAR 0: no space for [mem size 0x00004000 64bit]
> > [ 0.150769] pci 0002:01:00.0: BAR 0: failed to assign [mem size 0x00004000 64bit]
> >
> > This eventually prevents the PCIe NVME drive from being accessible.
> >
> > Add a quirk for these devices to avoid the resource being removed.
>
> Since this is a Windows laptop, I assume this works with Windows and
> that Windows will in fact assign BARs in that Memory32Fixed area.
>
> If we knew what the firmware author's intent was, we could probably
> make Linux understand it.
>
> Maybe (probably) Windows treats these descriptors the same on arm64 as
> on x86, i.e., *everything* in PNP0A03 _CRS is assumed to be "producer"
> (at least, that's my experimental observation; I have no actual
> knowledge of Windows).

That's my bet too.

>
> So I guess 8fd4391ee717 must have been motivated by some early arm64
> platform that put "consumer" descriptors in PNP0A03 _CRS as Lorenzo
> said [1].
>
> In that case I guess our choices are:
>
> - Add quirks like this and keep adding them for every new arm64
> platform that uses the same "everything in PNP0A03 _CRS is a
> producer" strategy.
>
> - Remove 8fd4391ee717, break whatever early arm64 platforms needed
> it, and add piecemeal quirks for them.
>
> I hate both, but I think I hate the first more because it has no end,
> while the second is painful but limited.

Thanks for your opinion on this! Let's try to pursue the second then.

>
> Obviously we would need to do whatever we can to identify and fix
> things that depend on 8fd4391ee717 before reverting it.

Lorenzo,

I have zero experience with any of those early arm64 platforms. I would
appreciate it if you could give some direction on how to identify them.

Looking at your comment below, I'm wondering if it's true that the
firmware on those early arm64 platforms has no MCFG table but provides
root->mcfg_addr via the _CBA method?

"I believe it is because there were arm64 platforms (early) that added a
consumer descriptor in the host bridge CRS with MMIO registers space in
it (I am not sure I can find the bug report - it has been a while,
remember the issue with non-ECAM config space and where to add the MMIO
resource required to "extend" MCFG config space ? I will never forget
that :))."

It would be very helpful if we could find someone running any of those
early platforms, so that we can ask them to dump the ACPI tables and test
things out.

Shawn