Re: tpm_sis IRQ storm on ThinkStation P360 Tiny

From: Jerry Snitselaar
Date: Fri May 05 2023 - 11:06:49 EST


On Fri, May 05, 2023 at 03:07:31PM +0200, Peter Zijlstra wrote:
> Hi,
>
> I recently saw my Alderlake NUC spewing on boot:
>
> [ 13.166514] irq 109: nobody cared (try booting with the "irqpoll" option)
> [ 13.166614] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 6.3.0+ #66
> [ 13.166694] Hardware name: LENOVO 30FBS0B800/330E, BIOS M4GKT18A 04/26/2022
> [ 13.166779] Call Trace:
> [ 13.166812] <IRQ>
> [ 13.166840] dump_stack_lvl+0x5b/0x90
> [ 13.166891] __report_bad_irq+0x2b/0xc0
> [ 13.166941] note_interrupt+0x2ac/0x2f0
> [ 13.166991] handle_irq_event+0x6f/0x80
> [ 13.167041] handle_fasteoi_irq+0x94/0x1f0
> [ 13.167093] __common_interrupt+0x72/0x160
> [ 13.167112] intel_rapl_common: Found RAPL domain package
> [ 13.167141] common_interrupt+0xb8/0xe0
> [ 13.167200] intel_rapl_common: Found RAPL domain core
> [ 13.167242] </IRQ>
> [ 13.167297] intel_rapl_common: Found RAPL domain uncore
> [ 13.167322] <TASK>
> [ 13.167437] asm_common_interrupt+0x26/0x40
> [ 13.167492] RIP: 0010:cpuidle_enter_state+0xff/0x500
> [ 13.167554] Code: c0 48 0f a3 05 72 34 ad 01 0f 82 fc 02 00 00 31 ff e8 35 b3 52 ff 45 84 ff 0f 85 cc 02 00 00 e8 f7 13 64 ff fb 0f 1f 44 00 00 <45> 85 f6 0f 88 eb 01 00 00 49 63 d6 48 8d 04 52 48 8d 04 82 49 8d
> [ 13.167766] RSP: 0018:ffffc900001ebe90 EFLAGS: 00000206
> [ 13.167843] RAX: 000000000012a8f3 RBX: ffffe8ffff480a00 RCX: 0000000000000000
> [ 13.167928] RDX: 0000000000000000 RSI: ffffffff8244c6ee RDI: ffffffff8242ca22
> [ 13.168021] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000001
> [ 13.168105] R10: 0000000000000003 R11: 000000000000000a R12: ffffffff83625a80
> [ 13.168189] R13: 0000000310c8ee5e R14: 0000000000000001 R15: 0000000000000000
> [ 13.168289] cpuidle_enter+0x2d/0x40
> [ 13.168339] do_idle+0x231/0x290
> [ 13.168383] cpu_startup_entry+0x1d/0x20
> [ 13.168432] start_secondary+0x11b/0x140
> [ 13.168482] secondary_startup_64_no_verify+0xf9/0xfb
> [ 13.168549] </TASK>
> [ 13.168587] handlers:
> [ 13.168617] [<00000000497ef927>] irq_default_primary_handler threaded [<00000000cf102de1>] tis_int_handler
> [ 13.168767] Disabling IRQ #109
>
> this is apparently:
>
> root@alderlake:~# cat /proc/interrupts | grep 109
> 109: 0 0 0 0 0 100002 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 IR-IO-APIC 109-fasteoi tpm0
>
> the TPM thing, which per same dmesg above is:
>
> [ 10.948058] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0x1D, rev-id 54)
>
> Booting with tpm_tis.interrupts=0 seems to cure things, and AFAICT the
> tpm device actually works -- that is, tpm2 getcap -l and tpm2 pcrread
> both give output, I'm presuming this is 'good'. I've never operated a
> TPM before.
>
> The machine in question is:
>
> Manufacturer: LENOVO
> Product Name: 30FBS0B800
> Version: ThinkStation P360 Tiny
>
> So I'm thinking that perhaps Lenovo carried the bug mentioned in commit:
> b154ce11ead9 ("tpm_tis: Disable interrupts on ThinkPad T490s") to more
> products.

Hi Peter,

It will poll like it has for years with tpm_tis.interrupts=0 so that
should be working as it was prior to 6.3 when interrupts were re-enabled
for tpm_tis. Are you seeing this with 6.2 as well? IIRC with that Thinkpad
case is when it was first realized that interrupts had accidentally been
disabled for tpm_tis at one point by a change.

I guess myself or someone else needs to revisit catching this in
general when the irq storm happens, and disabling interrupts for
tpm_tis. I think last time I was incorporating some feedback from
tglx, let my adhd get me distracted with some other issue and never
returned to it.

The diff below should (compile tested) work for the P360, but
tpm_tis.interrupts=0 is a good work-around.

Regards,
Jerry


diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index 7af389806643..12dfdbef574d 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -122,6 +122,14 @@ static const struct dmi_system_id tpm_tis_dmi_table[] = {
DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad T490s"),
},
},
+ {
+ .callback = tpm_tis_disable_irq,
+ .ident = "ThinkStation P360 Tiny",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkStation P360 Tiny"),
+ },
+ },
{}
};