Re: [PATCH 0/4] PCI SMC conduit, now with DT support

From: Jeremy Linton
Date: Thu Jul 28 2022 - 13:21:08 EST


Hi,

On 7/26/22 06:40, Will Deacon wrote:
On Mon, Jul 25, 2022 at 11:39:01AM -0500, Jeremy Linton wrote:
This is a rebase of the later revisions of [1], but refactored
slightly to add a DT method as well. It has all the same advantages of
the ACPI method (putting HW quirks in the firmware rather than the
kernel) but now applied to a 'pci-host-smc-generic' compatible
property which extends the pci-host-generic logic to handle cases
where the PCI Config region isn't ECAM compliant. With this in place,
and firmware-managed clock/phy/etc., it's possible to run the generic
driver on hardware that isn't what one would consider standards
compliant PCI root ports.

I still think that hiding the code in firmware because the hardware is
broken is absolutely the wrong way to tackle this problem and I thought
the general idea from last time was that we were going to teach Linux
about the broken hardware instead [1]. I'd rather have the junk where we
can see it, reason about it and modify it.

Well, the CM4/ACPI/PCIe quirk still hasn't landed, but that's not the point.

I would like to understand why you think this patch is any different than the dozens of other firmware traps, quite a number merged in the last year, for "broken" hardware or simply as generic platform interfaces?

Without rehashing the entire discussion in the previous thread, I'm going to repeat that this is an official Arm standard, the same as the firmware traps to handle speculative execution mitigations or to standardize platform functionality, e.g. PSCI or the recent TRNG code. It also has uses beyond fixing broken hardware.

But similar to those examples, I think everyone here understands that the kernel is a poor place for this kind of logic, and that handling it there may not even be technically feasible without supplying EL3 code, management processor code, or traps to said code.

Is it the official position of the Linux kernel maintainers that they will refuse to support future Arm standards in order to gatekeep specific hardware platforms?


What's changed?

Well, the code to support this interface is upstream in TFA, edk2, and various other OSes. So now Linux is trailing.


In my mind, the main thing that's happened since we last discussed this
is that Apple shipped arm64 client hardware with working ECAM. *Apple*
for goodness sake: a company with basically no incentive to follow
standards for their vertically integrated devices! Perhaps others need
to raise their game instead of wasting everybody's time on firmware
hacks; getting the hardware right obviously isn't as difficult as folks
would lead us to believe.

I find it interesting that you hold up the M1 as an example of good hardware. That hardware is one of the worst violators of platform standards, as well as having a lot of "broken" hardware requiring changes to the kernel that previously were rejected as too far out of line. Never mind that, as you point out, it has basically zero vendor support and exists only due to a large reverse engineering effort.


Thanks for looking at this,