Re: [regression] Bug 217386 - intel_powerclamp null pointer dereference in Linux 6.3.x

From: Rafael J. Wysocki
Date: Wed May 03 2023 - 08:34:44 EST


CC: Srinivas

On Tue, May 2, 2023 at 12:26 PM Linux regression tracking (Thorsten
Leemhuis) <regressions@xxxxxxxxxxxxx> wrote:
>
>
> Hi, Thorsten here, the Linux kernel's regression tracker.
>
> I noticed a regression report in bugzilla.kernel.org. As many (most?)
> kernel developers don't keep an eye on it, I decided to forward it by mail.
>
> Note, you have to use bugzilla to reach the reporter, as I sadly[1] can
> not CCed them in mails like this.
>
> Quoting from https://bugzilla.kernel.org/show_bug.cgi?id=217386 :
>
> > Risto A. Paju 2023-05-01 09:49:03 UTC
> >
> > Created attachment 304199 [details]
> > dmesg-6.3.1
> >
> > I use intel_powerclamp on a Thinkpad X220i via a custom script for
> > thermal management, and it triggers a kernel bug in Linux 6.3.0 and
> > 6.3.1. The script has a part
> >
> > awk something > /sys/class/thermal/cooling_device4/cur_state
> >
> > and the awk process hangs, with the null pointer bug reported in dmesg.
> >
> > The script has worked fine for years, and for now I've switched back to
> > the 6.2 series for this laptop.
> >
> > [tag] [reply] [−]
> > Private
> > Comment 1 Risto A. Paju 2023-05-01 12:23:15 UTC
> >
> > The affected CPU is an i3-2310M. I tested the same on a newer Intel
> > laptop with an i5-7300HQ, and there's no sign of the bug there.
>
> From the dmesg:
>
> > [ 16.495596] Oops: 0002 [#1] PREEMPT SMP PTI
> > [ 16.496084] CPU: 0 PID: 2792 Comm: awk Not tainted 6.3.1 #2
> > [ 16.496589] Hardware name: LENOVO 428737G/428737G, BIOS 8DET76WW (1.46 ) 06/21/2018
> > [ 16.497095] RIP: 0010:idle_inject_set_duration+0x6/0x20
> > [ 16.497607] Code: 00 49 c7 c4 f4 ff ff ff eb 92 49 c7 c4 f4 ff ff ff eb 91 cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 89 f0 01 d0 74 06 <89> 77 44 89 57 40 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00
> > [ 16.498824] RSP: 0018:ffffc900005e7de8 EFLAGS: 00010206
> > [ 16.499461] RAX: 00000000000927c0 RBX: 0000000000000000 RCX: 0000000000001770
> > [ 16.500103] RDX: 0000000000001770 RSI: 0000000000091050 RDI: 0000000000000000
> > [ 16.500752] RBP: 0000000000000002 R08: 0000000000000001 R09: 000000000000000a
> > [ 16.501405] R10: 000000000000000a R11: f000000000000000 R12: 0000000000000000
> > [ 16.502071] R13: ffff8881064bd720 R14: ffffc900005e7ea0 R15: ffff8881022283e0
> > [ 16.502745] FS: 00007fe494782b80(0000) GS:ffff888216200000(0000) knlGS:0000000000000000
> > [ 16.503421] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 16.504095] CR2: 0000000000000044 CR3: 000000010f94c006 CR4: 00000000000606f0
> > [ 16.504783] Call Trace:
> > [ 16.505456] <TASK>
> > [ 16.506124] powerclamp_set_cur_state+0x56/0x200 [intel_powerclamp]
> > [ 16.506810] cur_state_store+0x74/0xd0
> > [ 16.507497] kernfs_fop_write_iter+0x128/0x1c0
> > [ 16.508193] vfs_write+0x2be/0x3f0
> > [ 16.508897] ksys_write+0x5a/0xe0
> > [ 16.509794] do_syscall_64+0x3b/0x90
> > [ 16.510607] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> > [ 16.511444] RIP: 0033:0x7fe4948c0be0
> > [ 16.512247] Code: 40 00 48 8b 15 49 c2 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 80 3d 01 4a 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89
> > [ 16.514025] RSP: 002b:00007ffc045bd0d8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
> > [ 16.515180] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fe4948c0be0
> > [ 16.516346] RDX: 0000000000000002 RSI: 0000562bc4d50990 RDI: 0000000000000001
> > [ 16.517543] RBP: 00007fe49499e780 R08: 0000000000000007 R09: 0000562bc4d46da0
> > [ 16.518751] R10: 00007fe4947d5f50 R11: 0000000000000202 R12: 0000000000000002
> > [ 16.519977] R13: 0000562bc4d50990 R14: 0000000000000002 R15: 00007fe494999d60
> > [ 16.521234] </TASK>
> > [ 16.522479] Modules linked in:
> > [...]
> > [ 16.534416] CR2: 0000000000000044
> > [ 16.536059] ---[ end trace 0000000000000000 ]---
> > [ 16.537819] RIP: 0010:idle_inject_set_duration+0x6/0x20
> > [ 16.537827] Code: 00 49 c7 c4 f4 ff ff ff eb 92 49 c7 c4 f4 ff ff ff eb 91 cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 89 f0 01 d0 74 06 <89> 77 44 89 57 40 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00
> > [ 16.543180] RSP: 0018:ffffc900005e7de8 EFLAGS: 00010206
> > [ 16.543189] RAX: 00000000000927c0 RBX: 0000000000000000 RCX: 0000000000001770
> > [ 16.543193] RDX: 0000000000001770 RSI: 0000000000091050 RDI: 0000000000000000
> > [ 16.549435] RBP: 0000000000000002 R08: 0000000000000001 R09: 000000000000000a
> > [ 16.551087] R10: 000000000000000a R11: f000000000000000 R12: 0000000000000000
> > [ 16.551091] R13: ffff8881064bd720 R14: ffffc900005e7ea0 R15: ffff8881022283e0
> > [ 16.554264] FS: 00007fe494782b80(0000) GS:ffff888216200000(0000) knlGS:0000000000000000
> > [ 16.554270] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 16.554281] CR2: 0000560363a98298 CR3: 000000010f94c006 CR4: 00000000000606f0
> > [ 17.635258] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=2842 'X'
>
>
> See the ticket for more details.
>
>
> [TLDR for the rest of this mail: I'm adding this report to the list of
> tracked Linux kernel regressions; the text you find below is based on a
> few templates paragraphs you might have encountered already in similar
> form.]
>
> BTW, let me use this mail to also add the report to the list of tracked
> regressions to ensure it's doesn't fall through the cracks:
>
> #regzbot introduced: v6.2..v6.3
> https://bugzilla.kernel.org/show_bug.cgi?id=217386
> #regzbot title: pm: thermal: intel_powerclamp null pointer dereference
> #regzbot ignore-activity
>
> This isn't a regression? This issue or a fix for it are already
> discussed somewhere else? It was fixed already? You want to clarify when
> the regression started to happen? Or point out I got the title or
> something else totally wrong? Then just reply and tell me -- ideally
> while also telling regzbot about it, as explained by the page listed in
> the footer of this mail.
>
> Developers: When fixing the issue, remember to add 'Link:' tags pointing
> to the report (e.g. the buzgzilla ticket and maybe this mail as well, if
> this thread sees some discussion). See page linked in footer for details.
>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
>
> [1] because bugzilla.kernel.org tells users upon registration their
> "email address will never be displayed to logged out users"