Re: [PATCH v4 1/1] bus: mhi: host: Move IRQ allocation to controller registration phase

From: Kalle Valo
Date: Mon Jul 25 2022 - 14:00:21 EST


Manivannan Sadhasivam <mani@xxxxxxxxxx> writes:

> On Wed, Jul 20, 2022 at 05:47:37PM +0800, Qiang Yu wrote:
>
>>
>> On 7/20/2022 5:39 PM, Manivannan Sadhasivam wrote:
>> > On Mon, Jul 18, 2022 at 02:15:23PM +0300, Kalle Valo wrote:
>> > > + ath11k list
>> > >
>> > > Manivannan Sadhasivam <mani@xxxxxxxxxx> writes:
>> > >
>> > > > On Thu, Jun 23, 2022 at 10:43:03AM +0800, Qiang Yu wrote:
>> > > > > During runtime, the MHI endpoint may be powered up/down several times.
>> > > > > So instead of allocating and destroying the IRQs all the time, let's just
>> > > > > enable/disable IRQs during power up/down.
>> > > > >
>> > > > > The IRQs will be allocated during mhi_register_controller() and freed
>> > > > > during mhi_unregister_controller(). This works well for things like PCI
>> > > > > hotplug also as once the PCI device gets removed, the controller will
>> > > > > get unregistered. And once it comes back, it will get registered back
>> > > > > and even if the IRQ configuration changes (MSI), that will get accounted.
>> > > > >
>> > > > > Signed-off-by: Qiang Yu <quic_qianyu@xxxxxxxxxxx>
>> > > > Applied to mhi-next!
>> > > I did a bisect and this patch breaks ath11k during rmmod. I'm on
>> > > vacation right now so I can't investigate in detail but more info below.
>> > >
>> > I just tested linux-next/master next-20220718 on my NUC with QCA6390, but I'm
>> > not able to reproduce the issue during rmmod! Instead, I couldn't connect to the AP.
>>
>> I suspect that in __free_irq(), if CONFIG_DEBUG_SHIRQ is enabled, the IRQ
>> handler for a shared IRQ will be invoked and a NULL pointer access happens.
>>
>> #ifdef CONFIG_DEBUG_SHIRQ
>>     /*
>>      * It's a shared IRQ -- the driver ought to be prepared for an IRQ
>>      * event to happen even now it's being freed, so let's make sure that
>>      * is so by doing an extra call to the handler ....
>>      *
>>      * ( We do this after actually deregistering it, to make sure that a
>>      *   'real' IRQ doesn't run in parallel with our fake. )
>>      */
>>     if (action->flags & IRQF_SHARED) {
>>         local_irq_save(flags);
>>         action->handler(irq, dev_id);
>>         local_irq_restore(flags);
>>     }
>> #endif
>>
>
> Ah yes, after enabling CONFIG_DEBUG_SHIRQ I could reproduce the issue.

So how do we fix this regression? (If there is already a fix, I might have
missed it, as I only came back today.)
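
To make the suspected failure mode concrete, here is a minimal user-space
sketch in C of what the quoted CONFIG_DEBUG_SHIRQ block does: on free, a
shared IRQ gets one extra handler call, so the handler must cope with
driver state that is already gone. All names here (fake_ctrl,
fake_event_ring, fake_irq_handler, fake_free_irq) are hypothetical
stand-ins, not the real MHI or genirq code, and the NULL check is only one
possible way to guard a handler, not a proposed fix for this regression.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's irqreturn_t. */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;

/* Hypothetical per-device state; NOT the real MHI structures. */
struct fake_event_ring {
	int pending;			/* number of unprocessed events */
};

struct fake_ctrl {
	struct fake_event_ring *ring;	/* NULL once the driver has torn down */
};

typedef irqreturn_t (*fake_handler_t)(int irq, void *dev_id);

/*
 * A handler that tolerates being called after teardown: it checks the
 * state it is about to dereference and reports "not mine" otherwise.
 */
static irqreturn_t fake_irq_handler(int irq, void *dev_id)
{
	struct fake_ctrl *ctrl = dev_id;

	(void)irq;

	if (!ctrl || !ctrl->ring)
		return IRQ_NONE;	/* state already freed, nothing to do */

	if (ctrl->ring->pending == 0)
		return IRQ_NONE;	/* spurious call, no work queued */

	ctrl->ring->pending = 0;	/* "process" the pending events */
	return IRQ_HANDLED;
}

/*
 * Mimics only the debug behaviour quoted above: when the line is shared,
 * the handler is invoked one extra time while the IRQ is being freed.
 */
static irqreturn_t fake_free_irq(int irq, void *dev_id, bool shared,
				 fake_handler_t handler)
{
	assert(handler != NULL);

	if (shared)
		return handler(irq, dev_id);

	return IRQ_NONE;
}
```

Calling fake_free_irq() with shared set to true and a fake_ctrl whose ring
pointer is already NULL exercises the same path as the extra handler call
in the debug block; without the check in fake_irq_handler() that call would
dereference a NULL pointer.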

--
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches