Re: [PATCH v4] PCI: dwc: pci-dra7xx: Fix MSI IRQ handling

From: Vignesh Raghavendra
Date: Tue Mar 31 2020 - 07:06:13 EST




On 30/03/20 10:07 pm, Bjorn Helgaas wrote:
> On Mon, Mar 30, 2020 at 11:29:52AM -0500, Bjorn Helgaas wrote:
>> [+cc Marc, Thomas]
>>
>> On Fri, Mar 27, 2020 at 03:24:34PM +0530, Vignesh Raghavendra wrote:
>>> Due an issue with PCIe wrapper around DWC PCIe IP on dra7xx, driver
>>> needs to ensure that there are no pending MSI IRQ vector set (i.e
>>> PCIE_MSI_INTR0_STATUS reads 0 at least once) before exiting IRQ handler.
>>> Else, the dra7xx PCIe wrapper will not register new MSI IRQs even though
>>> PCIE_MSI_INTR0_STATUS shows IRQs are pending.
>>
>> I'm not an IRQ guy (real IRQ guys CC'd), but I'm wondering if this is
>> really a symptom of a problem in the generic DWC IRQ handling, not a
>> problem in dra7xx itself.
>>
>> I thought it was sort of standard behavior that a device would not
>> send a new MSI unless there was a transition from "no status bits set"
>> to "at least one status bit set". I'm looking at this text from the
>> PCIe r5.0 spec, sec 6.7.3.4:


This patch is addressing an issue wrt how DWC PCIe MSI IRQ status is
aggregated at TI wrapper level:

There is a single MSI status bit which is supposed to be logical OR of
all the MSI IRQ status bits inside the DWC wrapper. So if any of the
MSI IRQ status bits are set then this bit should read 1 and raise an
interrupt to CPU.
IRQ handler would then go through each MSI bit (inside DWC) and call
corresponding handler and then clear individual bits.
So the expectation was that wrapper level MSI status bit would auto
clear once all the DWC level MSI bits are cleared or wrapper will keep
the interrupt line asserted if there are still some outstanding ones.

But unfortunately that does not seem to be the case, MSI status bit in
the wrapper needs to be cleared manually. And moreover, once wrapper
level bit is cleared, the observation is that all the IRQ status bit
inside the DWC should be handled completely (i.e all the registers
should read 0 at least once) and only then is a new MSI IRQ guaranteed
to be recognized by the wrapper.

During debug without this commit, I often saw that DWC level MSI bit was
set (and IRQ status in endpoint's register was also set) but wrapper
level MSI bit was not set and host CPU never received an interrupt
therefore causing endpoint drivers to timeout waiting for certain events.

Regards
Vignesh

>>
>> If the Port is enabled for edge-triggered interrupt signaling using
>> MSI or MSI-X, an interrupt message must be sent every time the
>> logical AND of the following conditions transitions from FALSE to
>> TRUE:
>>
>> - The associated vector is unmasked (not applicable if MSI does
>> not support PVM).
>>
>> - The Hot-Plug Interrupt Enable bit in the Slot Control register
>> is set to 1b.
>>
>> - At least one hot-plug event status bit in the Slot Status
>> register and its associated enable bit in the Slot Control
>> register are both set to 1b.
>>
>> and this related commit: https://git.kernel.org/linus/fad214b0aa72
>
> and this one: https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?id=87d94ad41bd2
>

I am not sure how these fixes can be relevant to my problem.