Re: [PATCH 01/17] PCI: Add concurrency safe clear_and_set variants for LNKCTL{,2}

From: Bjorn Helgaas
Date: Mon May 15 2023 - 14:28:41 EST


On Mon, May 15, 2023 at 02:59:42PM +0300, Ilpo Järvinen wrote:
> On Sun, 14 May 2023, Lukas Wunner wrote:
> > On Fri, May 12, 2023 at 11:25:32AM +0300, Ilpo Järvinen wrote:
> > > On Thu, 11 May 2023, Lukas Wunner wrote:
> > > > On Thu, May 11, 2023 at 10:55:06AM -0500, Bjorn Helgaas wrote:
> > > > > I didn't see the prior discussion with Lukas, so maybe this was
> > > > > answered there, but is there any reason not to add locking to
> > > > > pcie_capability_clear_and_set_word() and friends directly?
> > > > >
> > > > > It would be nice to avoid having to decide whether to use the locked
> > > > > or unlocked versions.
> > > >
> > > > I think we definitely want to also offer lockless accessors which
> > > > can be used in hotpaths such as interrupt handlers if the accessed
> > > > registers don't need any locking (e.g. because there are no concurrent
> > > > accesses).
> > > > ...

> All the PCI_EXP_SLTSTA ones looked like ACK-bit type writes, not real RMW

The event bits in PCI_EXP_SLTSTA, PCI_EXP_LNKSTA, etc. are typically RW1C
and do not need the usual RMW locking (which I think is what you were
saying).
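
To illustrate the distinction, here is a minimal user-space sketch, not
kernel code. The struct and helper names (rw1c_reg, rw1c_ack,
rmw_lost_update, rmw_serialized) are hypothetical and only model the
register semantics. Acknowledging a RW1C bit is a single write that
leaves the other bits alone, while an unserialized read-modify-write of
an ordinary RW register can lose an update:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a RW1C status register: writing 1 to a bit
 * clears it, writing 0 leaves it unchanged. */
struct rw1c_reg {
	uint16_t val;
};

/* Acknowledge events with one write. No prior read is needed, so there
 * is no window in which another context's ack could be overwritten. */
static inline void rw1c_ack(struct rw1c_reg *r, uint16_t bits)
{
	r->val &= (uint16_t)~bits;
}

/* Two contexts A and B each set a bit in an ordinary RW register, but
 * B reads before A writes back. B's write then discards A's update. */
static inline uint16_t rmw_lost_update(uint16_t initial, uint16_t set_a,
				       uint16_t set_b)
{
	uint16_t reg = initial;
	uint16_t a = reg;		/* A reads */
	uint16_t b = reg;		/* B reads the same stale value */

	reg = (uint16_t)(a | set_a);	/* A writes back */
	reg = (uint16_t)(b | set_b);	/* B overwrites A's change */
	return reg;
}

/* The same two updates when each RMW completes before the next one
 * starts, which is what a lock around the RMW would guarantee. */
static inline uint16_t rmw_serialized(uint16_t initial, uint16_t set_a,
				      uint16_t set_b)
{
	uint16_t reg = initial;

	reg = (uint16_t)(reg | set_a);
	reg = (uint16_t)(reg | set_b);
	return reg;
}
```

With both event bits pending (0x0003), rw1c_ack(&r, 0x0001) leaves
0x0002 regardless of what any other context acks concurrently, whereas
rmw_lost_update(0, 0x1, 0x2) ends with only B's bit set.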

> > ...
> > What I think is unnecessary and counterproductive is to add wholesale
> > locking of any access to the PCI Express Capability Structure.
> >
> > It's fine to have a single spinlock, but I'd suggest only using it
> > for registers which are actually accessed concurrently by multiple
> > places in the kernel.
>
> While it does feel like an entirely unnecessary layer of complexity to me,
> it would be possible to rename the original
> pcie_capability_clear_and_set_word() to
> pcie_capability_clear_and_set_word_unlocked() and add this into
> include/linux/pci.h:
>
> static inline int pcie_capability_clear_and_set_word(struct pci_dev *dev,
>                                                      int pos, u16 clear, u16 set)
> {
>         if (pos == PCI_EXP_LNKCTL || pos == PCI_EXP_LNKCTL2 ||
>             pos == PCI_EXP_RTCTL)
>                 return pcie_capability_clear_and_set_word_locked(...);
>         else
>                 return pcie_capability_clear_and_set_word_unlocked(...);
> }
>
> It would keep the interface exactly the same but protect only a selectable
> set of registers. As pos is always a constant, the compiler should be able
> to optimize all the dead code away.
>
> Would that be ok then?

Sounds like you have a pretty strong opinion, Lukas, but I guess I
don't really understand the value of having locked and unlocked
variants of RMW accessors. Config accesses are relatively slow and I
don't think they're used in performance-sensitive paths. I would
expect the lock to be uncontended and cheap relative to the config
access itself, but I have no actual numbers to back up my speculation.
Is the performance win worth the extra complexity in callers?
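
For reference, the dispatch idea quoted above can be modeled in user
space roughly as follows. This is only a sketch under stated
assumptions: fake_dev, the FAKE_* offsets, and the clear_and_set*()
helpers are hypothetical stand-ins, and a C11 atomic_flag plays the role
of the spinlock. It is not the proposed kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for register offsets in the capability. */
#define FAKE_DEVCTL	0x08
#define FAKE_LNKCTL	0x10
#define FAKE_RTCTL	0x1c
#define FAKE_LNKCTL2	0x30

/* Hypothetical device: a lock, a counter showing how often the locked
 * path ran, and 16-bit registers indexed by offset / 2. */
struct fake_dev {
	atomic_flag lock;
	unsigned int locked_calls;
	uint16_t cap[32];
};

#define FAKE_DEV_INIT { ATOMIC_FLAG_INIT, 0, { 0 } }

/* Plain read-modify-write with no serialization. */
static inline int clear_and_set_unlocked(struct fake_dev *d, int pos,
					 uint16_t clear, uint16_t set)
{
	uint16_t v = d->cap[pos / 2];

	v &= (uint16_t)~clear;
	v |= set;
	d->cap[pos / 2] = v;
	return 0;
}

/* Same RMW, serialized by spinning on the atomic flag. */
static inline int clear_and_set_locked(struct fake_dev *d, int pos,
				       uint16_t clear, uint16_t set)
{
	int ret;

	while (atomic_flag_test_and_set_explicit(&d->lock,
						 memory_order_acquire))
		;	/* spin until the flag is released */
	d->locked_calls++;
	ret = clear_and_set_unlocked(d, pos, clear, set);
	atomic_flag_clear_explicit(&d->lock, memory_order_release);
	return ret;
}

/* Single entry point: only the selected registers take the lock. With
 * a constant pos the compiler can drop the branch not taken. */
static inline int clear_and_set(struct fake_dev *d, int pos,
				uint16_t clear, uint16_t set)
{
	if (pos == FAKE_LNKCTL || pos == FAKE_LNKCTL2 || pos == FAKE_RTCTL)
		return clear_and_set_locked(d, pos, clear, set);
	return clear_and_set_unlocked(d, pos, clear, set);
}
```

Callers see a single clear_and_set() either way. The open question is
whether skipping the lock for the remaining registers buys anything
measurable compared with simply always taking it.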

Bjorn