Re: [PATCH -tip V2 00/10] workqueue: break affinity initiatively

From: Dexuan-Linux Cui
Date: Tue Dec 22 2020 - 16:40:12 EST


On Fri, Dec 18, 2020 at 8:11 AM Lai Jiangshan <jiangshanlai@xxxxxxxxx> wrote:
>
> From: Lai Jiangshan <laijs@xxxxxxxxxxxxxxxxx>
>
> 06249738a41a ("workqueue: Manually break affinity on hotplug")
> said that scheduler will not force break affinity for us.
>
> But workqueue highly depends on the old behavior. Many parts of the code
> rely on it; 06249738a41a ("workqueue: Manually break affinity on hotplug")
> is not enough to change it, and the commit has flaws of its own too.
>
> It doesn't handle worker detachment.
> It doesn't handle worker attachment, mainly worker creation,
> which is handled by Valentin Schneider's patch [1].
> It doesn't handle unbound workers, which might be
> per-cpu kthreads.
>
> We need to thoroughly update the way workqueue handles affinity
> in cpu hot[un]plug, which is what this patchset intends to do, and
> replace Valentin Schneider's patch [1]. The equivalent patch
> is patch 10.
>
> Patch 1 fixes a flaw reported by Hillf Danton <hdanton@xxxxxxxx>.
> I have to include this fix because later patches depend on it.
>
> The patchset is based on tip/master rather than workqueue tree,
> because the patchset is a complement for 06249738a41a ("workqueue:
> Manually break affinity on hotplug") which is only in tip/master by now.
>
> And TJ acked to route the series through tip.
>
> Changed from V1:
> Add TJ's acked-by for the whole patchset
>
> Add more words to the comments and the changelog, mainly derived
> from discussion with Peter.
>
> Update the comments as TJ suggested.
>
> Update a line of code as Valentin suggested.
>
> Add Valentin's ack for patch 10 because "Seems alright to me." and
> add Valentin's comments to the changelog which is integral.
>
> [1]: https://lore.kernel.org/r/ff62e3ee994efb3620177bf7b19fab16f4866845.camel@xxxxxxxxxx
> [V1 patchset]: https://lore.kernel.org/lkml/20201214155457.3430-1-jiangshanlai@xxxxxxxxx/
>
> Cc: Hillf Danton <hdanton@xxxxxxxx>
> Cc: Valentin Schneider <valentin.schneider@xxxxxxx>
> Cc: Qian Cai <cai@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Vincent Donnefort <vincent.donnefort@xxxxxxx>
> Cc: Tejun Heo <tj@xxxxxxxxxx>
>
> Lai Jiangshan (10):
> workqueue: restore unbound_workers' cpumask correctly
> workqueue: use cpu_possible_mask instead of cpu_active_mask to break
> affinity
> workqueue: Manually break affinity on pool detachment
> workqueue: don't set the worker's cpumask when kthread_bind_mask()
> workqueue: introduce wq_online_cpumask
> workqueue: use wq_online_cpumask in restore_unbound_workers_cpumask()
> workqueue: Manually break affinity on hotplug for unbound pool
> workqueue: reorganize workqueue_online_cpu()
> workqueue: reorganize workqueue_offline_cpu() unbind_workers()
> workqueue: Fix affinity of kworkers when attaching into pool
>
> kernel/workqueue.c | 214 ++++++++++++++++++++++++++++-----------------
> 1 file changed, 132 insertions(+), 82 deletions(-)
>
> --
> 2.19.1.6.gb485710b

Hi,
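I tested this patchset on the master branch of today's tip.git
(981316394e35 ("Merge branch 'locking/urgent'")).

Every time the kernel boots with 32 CPUs (I'm running the Linux VM on
Hyper-V), I get the warning below.
(BTW, with 8 or 16 CPUs, I don't see the warning.)
By printing the cpumasks with "%*pbl", I found that the warning happens because:
new_mask = 16-31
cpu_online_mask = 0-16
cpu_active_mask = 0-15
p->nr_cpus_allowed = 16

That is, new_mask overlaps cpu_online_mask only at CPU 16, which is online
but not yet active, and the task is allowed on 16 CPUs rather than exactly
one, so all three conditions of the check below hold.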

2374         if (p->flags & PF_KTHREAD) {
2375                 /*
2376                  * For kernel threads that do indeed end up on online &&
2377                  * !active we want to ensure they are strict per-CPU threads.
2378                  */
2379                 WARN_ON(cpumask_intersects(new_mask, cpu_online_mask) &&
2380                         !cpumask_intersects(new_mask, cpu_active_mask) &&
2381                         p->nr_cpus_allowed != 1);
2382         }
2383
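
To make the condition concrete, here is a minimal userspace sketch that
models the three cpumasks as 32-bit bitmaps and evaluates the same boolean
expression as the WARN_ON() above. It is not kernel code: cpu_range() and
kthread_affinity_warns() are hypothetical helpers invented for illustration,
and the PF_KTHREAD test is assumed to have already passed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): build a bitmap with CPUs
 * first..last set, for 0 <= first <= last <= 31.
 */
static uint32_t cpu_range(unsigned int first, unsigned int last)
{
	uint32_t upto_last, below_first;

	assert(first <= last && last <= 31);

	/* Avoid shifting a 32-bit value by 32, which is undefined in C. */
	upto_last = (last == 31) ? UINT32_MAX
				 : ((UINT32_C(1) << (last + 1)) - 1);
	below_first = (UINT32_C(1) << first) - 1;

	return upto_last & ~below_first;
}

/*
 * Hypothetical model of the WARN_ON() expression: the new mask touches
 * an online CPU, touches no active CPU, and the task is not restricted
 * to exactly one CPU.
 */
static bool kthread_affinity_warns(uint32_t new_mask, uint32_t online,
				   uint32_t active, int nr_cpus_allowed)
{
	return (new_mask & online) != 0 &&
	       (new_mask & active) == 0 &&
	       nr_cpus_allowed != 1;
}
```

With the values reported above, kthread_affinity_warns(cpu_range(16, 31),
cpu_range(0, 16), cpu_range(0, 15), 16) evaluates to true. In this model the
expression becomes false once CPU 16 is also in the active mask, or if the
task is allowed on exactly one CPU.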

(FWIW, it looks like this patchset can fix a panic I noticed during
hibernation:
https://lkml.org/lkml/2020/12/22/141, though I see the same warning
during hibernation.)

[ 1.698042] smp: Bringing up secondary CPUs ...
[ 1.701707] x86: Booting SMP configuration:
[ 1.705368] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15
[ 1.721589] .... node #1, CPUs: #16
[ 1.013388] smpboot: CPU 16 Converting physical 0 to logical die 1
[ 1.809716] ------------[ cut here ]------------
[ 1.813553] WARNING: CPU: 16 PID: 90 at kernel/sched/core.c:2381 __set_cpus_allowed_ptr+0x19e/0x1b0
[ 1.813553] Modules linked in:
[ 1.813553] CPU: 16 PID: 90 Comm: cpuhp/16 Not tainted 5.10.0+ #1
[ 1.813553] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008 12/07/2018
[ 1.813553] RIP: 0010:__set_cpus_allowed_ptr+0x19e/0x1b0
[ 1.813553] Code: e8 e7 a3 39 00 85 c0 74 a7 ba 00 02 00 00 48 c7 c6 20 4b 9b 84 4c 89 ff e8 cf a3 39 00 85 c0 75 8f 83 bb a0 03 00 00 01 74 86 <0f> 0b eb 82 e8 49 ba 74 00 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00
[ 1.813553] RSP: 0000:ffffba9bc1ca7cf8 EFLAGS: 00010016
[ 1.813553] RAX: 0000000000000000 RBX: ffff98ed48d58000 RCX: 0000000000000008
[ 1.813553] RDX: 0000000000000200 RSI: ffffffff849b4b20 RDI: ffff98ed48d035a8
[ 1.813553] RBP: ffff98ed42a2ac00 R08: 0000000000000008 R09: 0000000000000008
[ 1.813553] R10: ffff98ed48d035a8 R11: ffffffff8484da40 R12: 0000000000000000
[ 1.813553] R13: 0000000000000010 R14: ffffffff849b4ba0 R15: ffff98ed48d035a8
[ 1.813553] FS: 0000000000000000(0000) GS:ffff98ee3aa00000(0000) knlGS:0000000000000000
[ 1.813553] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.813553] CR2: 0000000000000000 CR3: 000000019980a001 CR4: 00000000003706e0
[ 1.813553] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1.813553] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1.813553] Call Trace:
[ 1.813553] worker_attach_to_pool+0x53/0xd0
[ 1.813553] create_worker+0xf9/0x190
[ 1.813553] alloc_unbound_pwq+0x3a5/0x3b0
[ 1.813553] wq_update_unbound_numa+0x112/0x1c0
[ 1.813553] workqueue_online_cpu+0x1d0/0x220
[ 1.813553] ? workqueue_prepare_cpu+0x70/0x70
[ 1.813553] cpuhp_invoke_callback+0x82/0x4a0
[ 1.813553] ? sort_range+0x20/0x20
[ 1.813553] cpuhp_thread_fun+0xb8/0x120
[ 1.813553] smpboot_thread_fn+0x198/0x230
[ 1.813553] kthread+0x13d/0x160
[ 1.813553] ? kthread_create_on_node+0x60/0x60
[ 1.813553] ret_from_fork+0x22/0x30
[ 1.813553] ---[ end trace bc73d8bab71235fe ]---
[ 1.817553] #17 #18 #19 #20 #21 #22 #23 #24 #25 #26 #27 #28 #29 #30 #31
[ 1.826499] smp: Brought up 2 nodes, 32 CPUs
[ 1.833345] smpboot: Max logical packages: 2
[ 1.833574] smpboot: Total of 32 processors activated (146959.07 BogoMIPS)


Thanks,
Dexuan