Re: [PATCH pci] PCI: don't skip probing entire device if first fn OF node has status = "disabled"

From: Bjorn Helgaas
Date: Thu Jun 01 2023 - 11:45:14 EST


On Thu, Jun 01, 2023 at 11:11:56AM +0300, Vladimir Oltean wrote:
> On Wed, May 31, 2023 at 03:24:46PM -0500, Bjorn Helgaas wrote:
> > I guess I should have asked "what bad things happen without this patch
> > and without the DT 'disabled' status"?
>
> Well, now that you put it this way, I do realize that things are not so
> ideal for me.
>
> Our drivers for the functions of this device were already checking for
> of_device_is_available() during probe. So, reverting the core PCIe
> patch, they would still not register a network interface, which is good.
>
> However (and this is the bad part), multiple functions of this PCIe
> device unfortunately share a common memory, which is not zeroized by
> hardware, and so, to avoid multi-bit ECC errors, it must be zeroized by
> software, using some memory space accesses from all functions that have
> access to that shared memory (every function zeroizes its piece of it).
> This, sadly, includes functions which have status = "disabled". See
> commit 3222b5b613db ("net: enetc: initialize RFS/RSS memories for unused
> ports too").
>
> What we used to do was start probing a bit in enetc_pf_probe(), enable
> the memory space, zeroize our part of the shared memory, then check
> of_device_is_available() and finally, we disable the memory space again
> and exit probing with -ENODEV.
>
> That is not possible anymore with the core patch, because the PCIe core
> will not probe our disabled functions at all anymore.

To make sure I understand: you're saying that if Function
0 has DT status "disabled", 6fffbc7ae137 ("PCI: Honor firmware's
device disabled status") breaks things because we don't enumerate
Function 0 and the driver can't temporarily claim it to zero out its
piece of the shared memory.

With just 6fffbc7ae137, we don't enumerate Function 0, which means we
don't see that it's a multi-function device, so we don't enumerate
Functions 1, 2, etc, either.

With both 6fffbc7ae137 and your current patch, we would enumerate
Functions 1, 2, etc, but we still skip Function 0, so its piece of the
shared memory still doesn't get zeroed.
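
To make those three cases concrete, here is a userspace toy model.  It
is not kernel code and not how pci_scan_slot() is actually written.
All names in it (toy_fn, toy_scan_slot, the scan_policy values,
NUM_FN = 4) are made up for illustration.  It assumes Function 0's
header would report a multi-function device, and that a driver zeroes
a function's piece of the shared memory whenever the core hands it
that function, even if probe then bails out with -ENODEV.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_FN 4	/* hypothetical number of functions in the slot */

enum scan_policy {
	SCAN_ORIGINAL,		 /* core ignores DT status entirely */
	SCAN_6FFFBC7,		 /* 6fffbc7ae137 only */
	SCAN_6FFFBC7_PLUS_PATCH, /* 6fffbc7ae137 plus the patch discussed here */
};

struct toy_fn {
	bool dt_disabled;	/* DT node has status = "disabled" */
	bool enumerated;	/* core created a device for this function */
	bool mem_zeroed;	/* driver got to zero its piece of shared memory */
};

/* Toy stand-in for scanning one slot under a given policy. */
static void toy_scan_slot(struct toy_fn fn[NUM_FN], enum scan_policy policy)
{
	assert(fn != NULL);

	for (int i = 0; i < NUM_FN; i++) {
		/* With 6fffbc7ae137, a DT-disabled function is never enumerated. */
		bool skip = policy != SCAN_ORIGINAL && fn[i].dt_disabled;

		if (!skip) {
			fn[i].enumerated = true;
			/* Assumption: probe zeroes the memory before any -ENODEV exit. */
			fn[i].mem_zeroed = true;
		}

		/*
		 * With 6fffbc7ae137 alone, skipping Function 0 means we never
		 * learn the device is multi-function, so the scan stops here.
		 */
		if (i == 0 && skip && policy == SCAN_6FFFBC7)
			break;
	}
}
```

Starting from a zero-initialized array with only fn[0].dt_disabled set:
SCAN_ORIGINAL enumerates and zeroes all four functions, SCAN_6FFFBC7
enumerates none of them, and SCAN_6FFFBC7_PLUS_PATCH enumerates
Functions 1-3 but leaves Function 0 un-enumerated and its memory
un-zeroed.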

> The ENETC is not a hot-pluggable PCIe device. It uses Enhanced Allocation
> to essentially describe on-chip memory spaces, which are always present.
> So presumably, a different system-level solution to initialize those
> shared memories (U-Boot?) may be chosen, if implementing this workaround
> in Linux puts too much pressure on the PCIe core and the way in which it
> does things. Initially I didn't want to do this in prior boot stages
> because we only enable the RCEC in Linux, nothing is broken other than
> the spurious AER messages, and, you know.. the kernel may still run
> indefinitely on top of bootloaders which don't have the workaround applied.
> So working around it in Linux avoids one dependency.

If I understand correctly, something (bootloader or Linux) needs to
touch Function 0 (e.g., to clear its piece of the shared memory).
Doing it in Linux would minimize dependencies on the bootloader, so
that seems desirable to me.  That means Linux needs to enumerate
Function 0 so it is visible to a driver or possibly a quirk.

I think we could contemplate implementing 6fffbc7ae137 in a different
way. Checking DT status at driver probe-time would probably work for
Loongson, but wouldn't quite solve the NXP problem because the driver
wouldn't be able to claim Function 0 even temporarily.

Is DT the only way to learn the NXP SERDES configuration? I think it
would be much better if there were a way to programmatically learn it,
because then you wouldn't have to worry about syncing the DT with the
platform configuration, and it would decouple this from the Loongson
situation.

(If there were a way to actually discover the Loongson situation
instead of relying on DT, e.g., by keying off a Device ID or
something, that would be much better, too. I assume we explored that,
but I don't remember the details.)

Bjorn