Re: [PATCH 1/2] tpm, tpm_tis: Handle interrupt storm

From: Péter Ujfalusi
Date: Tue May 23 2023 - 05:13:47 EST




On 23/05/2023 10:44, Lukas Wunner wrote:
> On Tue, May 23, 2023 at 09:48:23AM +0300, Péter Ujfalusi wrote:
>> On 22/05/2023 17:31, Lino Sanfilippo wrote:
> [...]
>> This looked promising, however it looks like the UPX-i11 needs the DMI
>> quirk.
>
> Why is that? Is there a fundamental problem with the patch or is it
> a specific issue with that device?

The flood is not detected (if there is a flood at all), interrupt stops
working after about 200 interrupts - in the latest boot at 118th.
I can check this later, likely tomorrow.

>>> --- a/drivers/char/tpm/tpm_tis_core.c
>>> +++ b/drivers/char/tpm/tpm_tis_core.c
>>> @@ -752,6 +752,55 @@ static bool tpm_tis_req_canceled(struct tpm_chip *chip, u8 status)
>>> return status == TPM_STS_COMMAND_READY;
>>> }
>>>
>>> +static void tpm_tis_handle_irq_storm(struct tpm_chip *chip)
>>> +{
>>> + struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
>>> + int intmask = 0;
>>> +
>>> + dev_err(&chip->dev, HW_ERR
>>> + "TPM interrupt storm detected, polling instead\n");
>>
>> Should this be dev_warn or even dev_info level?
>
> The corresponding message emitted in tpm_tis_core_init() for
> an interrupt that's *never* asserted uses dev_err(), so using
> dev_err() here as well serves consistency:
>
> dev_err(&chip->dev, FW_BUG
> "TPM interrupt not working, polling instead\n");
>
> That way the same severity is used both for the never asserted and
> the never deasserted interrupt case.

Oh, OK.
Is there anything the user can do to have a ERROR less boot?

>
>>> + if (priv->unhandled_irqs > MAX_UNHANDLED_IRQS)
>>> + tpm_tis_handle_irq_storm(chip);
>>
>> Will the kernel step in and disbale the IRQ before we would have
>> detected the storm?
>
> No. The detection of spurious interrupts in note_interrupt()
> hinges on handlers returning IRQ_NONE. And this patch makes
> tis_int_handler() always return IRQ_HANDLED, thus pretending
> success to genirq code.

True, thanks!

>
>>> rc = tpm_tis_write32(priv, TPM_INT_STATUS(priv->locality), interrupt);
>>> tpm_tis_relinquish_locality(chip, 0);
>>> if (rc < 0)
>>> - return IRQ_NONE;
>>> + goto unhandled;
>>
>> This is more like an error than just unhandled IRQ. Yes, it was ignored,
>> probably because it is common?
>
> The interrupt may be shared and then it's not an error.

but this is tpm_tis_write32() failing, no? If it is shared interrupt and
we return IRQ_HANDLED unconditionally then I think the core will think
that the interrupt was for this device and it was handled.

--
Péter