Re: [PATCH 14/24] workqueue: Generalize unbound CPU pods

From: Tejun Heo
Date: Thu Jun 08 2023 - 18:50:31 EST


Hello,

On Thu, Jun 08, 2023 at 08:31:34AM +0530, K Prateek Nayak wrote:
...
> Thank you for sharing the debug branch. I've managed to hit one of
> the WARN_ON_ONCE() consistently but I still haven't seen a kernel panic
> yet. Sharing the traces below:

Yeah, that's good. It does a dirty fix-up. Shouldn't crash.

> o Early Boot
>
> [ 4.182411] ------------[ cut here ]------------
> [ 4.186313] WARNING: CPU: 0 PID: 1 at kernel/workqueue.c:1130 kick_pool+0xdb/0xe0
> [ 4.186313] Modules linked in:
> [ 4.186313] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.4.0-rc1-tj-wq-valid-cpu+ #481
> [ 4.186313] Hardware name: Dell Inc. PowerEdge R6525/024PW1, BIOS 2.7.3 03/30/2022
> [ 4.186313] RIP: 0010:kick_pool+0xdb/0xe0
> [ 4.186313] Code: 6b c0 d0 01 73 24 41 89 45 64 49 8b 54 24 f8 48 89 d0 30 c0 83 e2 04 ba 00 00 00 00 48 0f 44 c2 48 83 80 c0 00 00 00 01 eb 82 <0f> 0b eb dc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f
> [ 4.186313] RSP: 0018:ffffbc1b800e7dd8 EFLAGS: 00010046
> [ 4.186313] RAX: 0000000000000100 RBX: ffff97c73d2321c0 RCX: 0000000000000000
> [ 4.186313] RDX: 0000000000000040 RSI: 0000000000000001 RDI: ffff9788c0159728
> [ 4.186313] RBP: ffffbc1b800e7df0 R08: 0000000000000100 R09: ffff9788c01593e0
> [ 4.186313] R10: ffff9788c01593c0 R11: 0000000000000001 R12: ffffffff8c582430
> [ 4.186313] R13: ffff9788c03fcd40 R14: 0000000000000000 R15: ffff97c73d2324b0
> [ 4.186313] FS: 0000000000000000(0000) GS:ffff97c73d200000(0000) knlGS:0000000000000000
> [ 4.186313] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 4.186313] CR2: ffff97cecee01000 CR3: 000000470d43a001 CR4: 0000000000770ef0
> [ 4.186313] PKRU: 55555554
> [ 4.186313] Call Trace:
> [ 4.186313] <TASK>
> [ 4.186313] create_worker+0x14e/0x280
> [ 4.186313] ? wake_up_process+0x15/0x20
> [ 4.186313] workqueue_init+0x22a/0x3d0
> [ 4.186313] kernel_init_freeable+0x1fe/0x4f0
> [ 4.186313] ? __pfx_kernel_init+0x10/0x10
> [ 4.186313] kernel_init+0x1b/0x1f0
> [ 4.186313] ? __pfx_kernel_init+0x10/0x10
> [ 4.186313] ret_from_fork+0x2c/0x50
> [ 4.186313] </TASK>
> [ 4.186313] ---[ end trace 0000000000000000 ]---
>
> o I consistently see a WARN_ON_ONCE() in kick_pool() being hit when I
> run "sudo ./stress-ng --iomix 96 --timeout 1m". I've seen a few
> different stack traces so far. Including all below just in case:
...
> This is the same WARN_ON_ONCE() you had added in the HEAD commit:
>
> $ scripts/faddr2line vmlinux kick_pool+0xdb
> kick_pool+0xdb/0xe0:
> kick_pool at kernel/workqueue.c:1130 (discriminator 1)
>
> $ sed -n 1130,1132p kernel/workqueue.c
> 	if (!WARN_ON_ONCE(wake_cpu >= nr_cpu_ids))
> 		p->wake_cpu = wake_cpu;
> 	get_work_pwq(work)->stats[PWQ_STAT_REPATRIATED]++;
>
> Let me know if you need any more data from my test setup.
> P.S. The kernel is still up and running (~30min) despite hitting this
> WARN_ON_ONCE() in my case :)
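
For readers following the thread, the shape of the guard quoted above can be
illustrated with a small userspace sketch. This is not the kernel code:
NR_CPU_IDS, struct fake_task and set_wake_cpu_checked() are hypothetical
stand-ins, and the warn-once behaviour is only approximated with a static
flag. It shows why an out-of-range value produces a warning rather than a
crash: the bad value is reported once and never stored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's nr_cpu_ids. */
#define NR_CPU_IDS 256u

/* Hypothetical stand-in for the one task field the guard touches. */
struct fake_task {
	unsigned int wake_cpu;
};

/*
 * Accept the new CPU only when it is a valid index. Otherwise print a
 * warning the first time it happens and leave the task unchanged.
 * Returns true when the value was stored.
 */
static bool set_wake_cpu_checked(struct fake_task *p, unsigned int wake_cpu)
{
	static bool warned;	/* approximates the "once" in WARN_ON_ONCE() */

	assert(p != NULL);

	if (wake_cpu >= NR_CPU_IDS) {
		if (!warned) {
			warned = true;
			fprintf(stderr, "WARNING: wake_cpu %u out of range\n",
				wake_cpu);
		}
		return false;
	}

	p->wake_cpu = wake_cpu;
	return true;
}
```

With this sketch, a call such as set_wake_cpu_checked(&t, NR_CPU_IDS) returns
false and leaves t.wake_cpu at its previous value, which matches the report
that the machine keeps running after the warning fires.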

Okay, that was me being stupid and not initializing the new fields for
per-cpu workqueues. Can you please test the following branch? It should have
both bugs fixed properly.

git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git affinity-scopes-v2

If that doesn't crash, I'd love to hear how it affects the perf regressions
reported over the past few months.

Thanks.

--
tejun