Re: [PATCH] irq: Resolve that mask_irq/unmask_irq may not be called in pairs

From: Thomas Gleixner
Date: Wed Dec 13 2023 - 09:59:37 EST


On Wed, Dec 13 2023 at 10:29, xiongxin wrote:
> 在 2023/12/12 23:17, Thomas Gleixner 写道:
> Sorry, the previous reply may not have clarified the BUG process. I
> re-debugged and confirmed it yesterday. The current BUG execution
> sequence is described as follows:

It's the sequence how this works and it works correctly.

Just because it does not work on your machine it does not mean that this
is incorrect and a BUG.

You are trying to fix a symptom and thereby violating guarantees of the
core code.

> That is, there is a time between the 1:handle_level_irq() and
> 3:irq_thread_fn() calls for the 2:disable_irq() call to acquire the lock
> and then implement the irq_state_set_disabled() operation. When finally
> call irq_thread_fn()->irq_finalize_oneshot(), it cannot enter the
> unmask_thread_irq() process.

Correct, because the interrupt has been DISABLED in the mean time.

> In this case, the gpio irq_chip irq_mask()/irq_unmask() callback pairs
> are not called in pairs, so I think this is a BUG, but not necessarily
> fixed from the irq core code layer.

No. It is _NOT_ a BUG. unmask() is not allowed to be invoked when the
interrupt is DISABLED. That's the last time I'm going to tell you that.
Only enable_irq() can undo the effect of disable_irq(), period.

> Next, when the gpio controller driver calls the suspend/resume process,
> it is as follows:
>
> suspend process:
> dwapb_gpio_suspend()
> ctx->int_mask = dwapb_read(gpio, GPIO_INTMASK);
>
> resume process:
> dwapb_gpio_resume()
> dwapb_write(gpio, GPIO_INTMASK, ctx->int_mask);

Did you actually look at the sequence I gave you?

Suspend:

i2c_hid_core_suspend()
disable_irq(); <- Marks it disabled and eventually
masks it.

gpio_irq_suspend()
save_registers(); <- Saves masked interrupt

Resume:

gpio_irq_resume()
restore_registers(); <- Restores masked interrupt

i2c_hid_core_resume()
enable_irq(); <- Unmasks interrupt and removes the
disabled marker


Have you verified that this order of invocations is what happens on
your machine?

Thanks,

tglx