Re: [PATCH 01/17] PCI: Add concurrency safe clear_and_set variants for LNKCTL{,2}

From: Lukas Wunner
Date: Mon May 15 2023 - 18:13:40 EST


On Mon, May 15, 2023 at 02:59:42PM +0300, Ilpo Järvinen wrote:
> While it does feel like an entirely unnecessary layer of complexity to me, it would
> be possible to rename the original pcie_capability_clear_and_set_word() to
> pcie_capability_clear_and_set_word_unlocked() and add this into
> include/linux/pci.h:
>
> static inline int pcie_capability_clear_and_set_word(struct pci_dev *dev,
>                                                      int pos, u16 clear, u16 set)
> {
>         if (pos == PCI_EXP_LNKCTL || pos == PCI_EXP_LNKCTL2 ||
>             pos == PCI_EXP_RTCTL)
>                 pcie_capability_clear_and_set_word_locked(...);
>         else
>                 pcie_capability_clear_and_set_word_unlocked(...);
> }
>
> It would keep the interface exactly the same but protect only a selectable
> set of registers. As pos is always a constant, the compiler should be able
> to optimize all the dead code away.

That's actually quite neat, I like it. It documents clearly which
registers need protection because of concurrent RMWs and callers can't
do anything wrong.

Though I'd use a switch/case statement so that future additions
of registers that need protection are always just a clean, one-line
change.
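
A rough sketch of the switch/case form, written as a standalone
userspace mock rather than kernel code. The mock_* names, the call
counters and the stubbed helpers are assumptions for illustration only;
the register offsets are the ones from include/uapi/linux/pci_regs.h.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets within the PCI Express Capability (as in pci_regs.h). */
#define PCI_EXP_DEVCTL   0x08
#define PCI_EXP_LNKCTL   0x10
#define PCI_EXP_RTCTL    0x1c
#define PCI_EXP_LNKCTL2  0x30

/* Hypothetical stand-in for struct pci_dev: only counts which path ran. */
struct mock_pci_dev {
	int locked_calls;
	int unlocked_calls;
};

/* Stub for the variant that would take the lock around the RMW. */
static int mock_clear_and_set_word_locked(struct mock_pci_dev *dev, int pos,
					  uint16_t clear, uint16_t set)
{
	(void)pos; (void)clear; (void)set;
	assert(dev);
	dev->locked_calls++;
	return 0;
}

/* Stub for the variant that performs the RMW without locking. */
static int mock_clear_and_set_word_unlocked(struct mock_pci_dev *dev, int pos,
					    uint16_t clear, uint16_t set)
{
	(void)pos; (void)clear; (void)set;
	assert(dev);
	dev->unlocked_calls++;
	return 0;
}

/*
 * Registers that are modified concurrently in RMW fashion go through the
 * locked variant.  Protecting another register is a one-line "case".
 */
static inline int mock_clear_and_set_word(struct mock_pci_dev *dev, int pos,
					  uint16_t clear, uint16_t set)
{
	switch (pos) {
	case PCI_EXP_LNKCTL:
	case PCI_EXP_LNKCTL2:
	case PCI_EXP_RTCTL:
		return mock_clear_and_set_word_locked(dev, pos, clear, set);
	default:
		return mock_clear_and_set_word_unlocked(dev, pos, clear, set);
	}
}
```

With a constant pos, the compiler can be expected to fold the switch
down to a direct call of one of the two variants, same as with the
if/else chain.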

Plus some kernel-doc or code comment to explain that certain
registers in the PCI Express Capability Structure are accessed
concurrently in a RMW fashion, hence require locking.

Since this protects specifically registers in the PCI Express
Capability, whose location is cached in struct pci_dev->pcie_cap,
I'm wondering if pcie_cap_lock is a clearer name.
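
To illustrate what such a lock would cover, here is a minimal userspace
sketch of a locked read-modify-write. It is not the kernel
implementation: the struct, the function name and the C11 atomic_flag
spin loop are assumptions standing in for struct pci_dev, the real
accessor and a kernel spinlock. Only the pcie_cap_lock name comes from
the suggestion above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant bits of struct pci_dev. */
struct mock_dev_with_lock {
	atomic_flag pcie_cap_lock;  /* guards RMW of the register below */
	uint16_t lnkctl;            /* stand-in for the Link Control register */
};

/*
 * Clear and set bits in one critical section, so that two callers
 * modifying different bits cannot overwrite each other's update
 * between the read and the write.
 */
static int mock_lnkctl_clear_and_set(struct mock_dev_with_lock *dev,
				     uint16_t clear, uint16_t set)
{
	uint16_t val;

	assert(dev);

	/* Acquire: spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&dev->pcie_cap_lock,
						 memory_order_acquire))
		;

	val = dev->lnkctl;                            /* read   */
	val = (uint16_t)((val & ~clear) | set);       /* modify */
	dev->lnkctl = val;                            /* write  */

	/* Release the lock. */
	atomic_flag_clear_explicit(&dev->pcie_cap_lock, memory_order_release);
	return 0;
}
```

Without the lock, a second caller could read the old value between the
read and the write above and then write it back, dropping the first
caller's change.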


> The PCI_EXP_SLTCTL write is protected by a mutex; it doesn't look like
> something that matches your initial concern about "hot paths (e.g.
> interrupt handlers)".

PCI_EXP_SLTCTL is definitely modified from the interrupt handler
pciehp_ist(), but one could argue that hotplug interrupts don't
usually occur *that* often. (We've had interrupt storms though
from broken devices or ones with a shared interrupt etc.)

I guess I'm just generally worried about acquiring a lock that's
not necessary. E.g. on boot, numerous config space accesses are
performed to enumerate and initialize devices and reducing concurrency
might slow down boot times. It's just a risk that I'd recommend
avoiding if possible.

Thanks,

Lukas