Re: [PATCH V3] tracing/timerlat: Hotplug support for the user-space interface

From: Steven Rostedt
Date: Wed Oct 04 2023 - 08:16:31 EST


On Wed, 4 Oct 2023 12:02:52 +0200
Daniel Bristot de Oliveira <bristot@xxxxxxxxxx> wrote:

> On 10/4/23 03:03, Steven Rostedt wrote:
> > On Fri, 29 Sep 2023 17:02:46 +0200
> > Daniel Bristot de Oliveira <bristot@xxxxxxxxxx> wrote:
> >
> >> The osnoise/per_cpu/CPU$/timerlat_fd is created for each possible
> >> CPU, but it might create confusion if the CPU is not online.
> >>
> >> Create the file only for online CPUs, also follow hotplug by
> >> creating and deleting as CPUs come and go.
> >>
> >> Fixes: e88ed227f639 ("tracing/timerlat: Add user-space interface")
> >
> > Is this a fix that needs to go in now and Cc'd to stable? Or is this
> > something that can wait till the next merge window?
>
> We can wait for the next merge window... it is a non-trivial fix.
>

The requirement is whether it's a fix, not how "trivial" it is.

That said, I'm able to consistently trigger this:

BUG: kernel NULL pointer dereference, address: 00000000000000a0
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP PTI
CPU: 1 PID: 20 Comm: cpuhp/1 Not tainted 6.6.0-rc4-test-00008-g2df8f295b0e2 #103
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:down_write+0x23/0x70
Code: 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 53 48 89 fb e8 2e bc ff ff bf 01 00 00 00 e8 24 14 31 ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 13 75 33 65 48 8b 04 25 00 36 03 00 48 89 43 08 bf 01
RSP: 0018:ffffb17f800e3d98 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00000000000000a0 RCX: ffffff8100000000
RDX: 0000000000000001 RSI: 0000000000000064 RDI: ffffffffb6edd5cc
RBP: ffffb17f800e3df8 R08: ffff8c6237c61188 R09: 000000008020001b
R10: ffff8c6237c61160 R11: 0000000000000001 R12: 000000000002da30
R13: 0000000000000000 R14: ffffffffb6314080 R15: ffff8c6237c61188
FS: 0000000000000000(0000) GS:ffff8c6237c40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000a0 CR3: 0000000102412001 CR4: 0000000000170ee0
Call Trace:
<TASK>
? __die+0x23/0x70
? page_fault_oops+0x17d/0x4c0
? exc_page_fault+0x7f/0x180
? asm_exc_page_fault+0x26/0x30
? __pfx_osnoise_cpu_die+0x10/0x10
? down_write+0x1c/0x70
? down_write+0x23/0x70
? down_write+0x1c/0x70
simple_recursive_removal+0xef/0x280
? __pfx_remove_one+0x10/0x10
? __pfx_osnoise_cpu_die+0x10/0x10
tracefs_remove+0x44/0x70
timerlat_rm_per_cpu_interface+0x28/0x70
osnoise_cpu_die+0xf/0x20
cpuhp_invoke_callback+0xf8/0x460
? __pfx_smpboot_thread_fn+0x10/0x10
cpuhp_thread_fun+0xf3/0x190
smpboot_thread_fn+0x18c/0x230
kthread+0xf7/0x130
? __pfx_kthread+0x10/0x10
ret_from_fork+0x34/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1b/0x30
</TASK>
Modules linked in:
CR2: 00000000000000a0
---[ end trace 0000000000000000 ]---
RIP: 0010:down_write+0x23/0x70
Code: 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 53 48 89 fb e8 2e bc ff ff bf 01 00 00 00 e8 24 14 31 ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 13 75 33 65 48 8b 04 25 00 36 03 00 48 89 43 08 bf 01
RSP: 0018:ffffb17f800e3d98 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00000000000000a0 RCX: ffffff8100000000
RDX: 0000000000000001 RSI: 0000000000000064 RDI: ffffffffb6edd5cc
RBP: ffffb17f800e3df8 R08: ffff8c6237c61188 R09: 000000008020001b
R10: ffff8c6237c61160 R11: 0000000000000001 R12: 000000000002da30
R13: 0000000000000000 R14: ffffffffb6314080 R15: ffff8c6237c61188
FS: 0000000000000000(0000) GS:ffff8c6237c40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000a0 CR3: 0000000102412001 CR4: 0000000000170ee0
note: cpuhp/1[20] exited with irqs disabled
note: cpuhp/1[20] exited with preempt_count 1


When running the attached script as:

# ./ftrace-test-tracers sleep 1

-- Steve

Attachment: ftrace-test-tracers
Description: Binary data