Re: [PATCH 1/2] tpm, tpm_tis: Handle interrupt storm

From: Hans de Goede
Date: Tue May 23 2023 - 05:21:04 EST


Hi,

On 5/23/23 11:14, Péter Ujfalusi wrote:
>
>
> On 23/05/2023 10:44, Lukas Wunner wrote:
>> On Tue, May 23, 2023 at 09:48:23AM +0300, Péter Ujfalusi wrote:
>>> On 22/05/2023 17:31, Lino Sanfilippo wrote:
>> [...]
>>> This looked promising, however it looks like the UPX-i11 needs the DMI
>>> quirk.
>>
>> Why is that? Is there a fundamental problem with the patch or is it
>> a specific issue with that device?
>
> The flood is not detected (if there is a flood at all), interrupt stops
> working after about 200 interrupts - in the latest boot at 118th.
> I can check this later, likely tomorrow.
>
>>>> --- a/drivers/char/tpm/tpm_tis_core.c
>>>> +++ b/drivers/char/tpm/tpm_tis_core.c
>>>> @@ -752,6 +752,55 @@ static bool tpm_tis_req_canceled(struct tpm_chip *chip, u8 status)
>>>> return status == TPM_STS_COMMAND_READY;
>>>> }
>>>>
>>>> +static void tpm_tis_handle_irq_storm(struct tpm_chip *chip)
>>>> +{
>>>> + struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
>>>> + int intmask = 0;
>>>> +
>>>> + dev_err(&chip->dev, HW_ERR
>>>> + "TPM interrupt storm detected, polling instead\n");
>>>
>>> Should this be dev_warn or even dev_info level?
>>
>> The corresponding message emitted in tpm_tis_core_init() for
>> an interrupt that's *never* asserted uses dev_err(), so using
>> dev_err() here as well serves consistency:
>>
>> dev_err(&chip->dev, FW_BUG
>> "TPM interrupt not working, polling instead\n");
>>
>> That way the same severity is used both for the never asserted and
>> the never deasserted interrupt case.
>
> Oh, OK.
> Is there anything the user can do to have a ERROR less boot?

That is a good point. Even though the typical dmesg has at least some false-positive error messages I believe we should still strive to not log errors in cases where there is not e.g. an actual error which the user needs to care about (e.g. disk IO errors).

Usually in similar cases like this, where we are basically correcting for firmware bugs (1) we use:

dev_warn(dev, FW_BUG "...", ...);

maybe we should switch both messages here to this ?

FW_BUG is: defined in linux/printk as:

#define FW_BUG "[Firmware Bug]: "

And we are trying to use this in places like this both for uniformity of reporting these kinda bugs and to allow grepping for it.

Regards,

Hans



1) providing wrong / non working ACPI IRQ resources in this case