Re: [PATCH] sched/wait: Fix a kthread_park race with wait_woken()

From: John Stultz
Date: Thu May 11 2023 - 20:52:41 EST


On Thu, May 11, 2023 at 3:31 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, May 11, 2023 at 09:41:30PM +0000, John Stultz wrote:
> > From: Arve Hjønnevåg <arve@xxxxxxxxxxx>
> >
> > kthread_park and wait_woken have a similar race that kthread_stop and
> > wait_woken used to have before it was fixed in
> > cb6538e740d7543cd989128625cf8cac4b471e0a. Extend that fix to also cover
>
> cb6538e740d7 ("sched/wait: Fix a kthread race with wait_woken()")
>
> > kthread_park.
> >
> > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Cc: Juri Lelli <juri.lelli@xxxxxxxxxx>
> > Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> > Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
> > Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> > Cc: Ben Segall <bsegall@xxxxxxxxxx>
> > Cc: Mel Gorman <mgorman@xxxxxxx>
> > Cc: Daniel Bristot de Oliveira <bristot@xxxxxxxxxx>
> > Cc: Valentin Schneider <vschneid@xxxxxxxxxx>
> > Signed-off-by: Arve Hjønnevåg <arve@xxxxxxxxxxx>
> > Signed-off-by: John Stultz <jstultz@xxxxxxxxxx>
> > ---
> > This seemingly slipped by, so I wanted to resend it
> > for review.
> > ---
> > kernel/sched/wait.c | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
> > index 133b74730738..a9cf49da884b 100644
> > --- a/kernel/sched/wait.c
> > +++ b/kernel/sched/wait.c
> > @@ -425,9 +425,9 @@ int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, i
> > }
> > EXPORT_SYMBOL(autoremove_wake_function);
> >
> > -static inline bool is_kthread_should_stop(void)
> > +static inline bool is_kthread_should_stop_or_park(void)
> > {
> > - return (current->flags & PF_KTHREAD) && kthread_should_stop();
> > + return (current->flags & PF_KTHREAD) && (kthread_should_stop() || kthread_should_park());
> > }
> >
> > /*
>
> That's a bit sad; that two function calls for checking two consecutive
> bits in the same word :-(
>
> If we move this to kthread.c and write it like:
>
> kthread = __to_kthread(current);
> if (!kthread)
> return false;
>
> return test_bit(KTHREAD_SHOULD_STOP, &kthread->flags) ||
> test_bit(KTHREAD_SHOULD_PARK, &kthread->flags);
>
> Then the compiler should be able to merge the two bits in a single load
> and test due to constant_test_bit() -- do check though.

Hrm. Apologies, as it's been awhile since I've looked at disassembled
asm, so I may be wrong, but I don't think that's happening here.

With the logic above I'm seeing it build as:
0000000000000a50 <kthread_should_stop_or_park>:
a50: 65 48 8b 14 25 00 00 mov %gs:0x0,%rdx
a57: 00 00
a59: 48 8b 8a 78 0a 00 00 mov 0xa78(%rdx),%rcx
a60: 31 c0 xor %eax,%eax
a62: 48 85 c9 test %rcx,%rcx
a65: 74 11 je a78
<kthread_should_stop_or_park+0x28>
a67: f6 42 2e 20 testb $0x20,0x2e(%rdx)
a6b: 74 0b je a78
<kthread_should_stop_or_park+0x28>
a6d: 48 8b 01 mov (%rcx),%rax
a70: 48 d1 e8 shr %rax
a73: 83 e0 01 and $0x1,%eax
a76: 74 05 je a7d
<kthread_should_stop_or_park+0x2d>
a78: e9 00 00 00 00 jmp a7d
<kthread_should_stop_or_park+0x2d>
a7d: 48 8b 01 mov (%rcx),%rax
a80: 48 c1 e8 02 shr $0x2,%rax
a84: 83 e0 01 and $0x1,%eax
a87: e9 00 00 00 00 jmp a8c
<kthread_should_stop_or_park+0x3c>
a8c: 0f 1f 40 00 nopl 0x0(%rax)


Where as if I drop the second test_bit() to see if it was similar, I see:
0000000000000a50 <kthread_should_stop_or_park>:
a50: 65 48 8b 14 25 00 00 mov %gs:0x0,%rdx
a57: 00 00
a59: 48 8b 8a 78 0a 00 00 mov 0xa78(%rdx),%rcx
a60: 31 c0 xor %eax,%eax
a62: 48 85 c9 test %rcx,%rcx
a65: 74 14 je a7b
<kthread_should_stop_or_park+0x2b>
a67: f6 42 2e 20 testb $0x20,0x2e(%rdx)
a6b: 74 0e je a7b
<kthread_should_stop_or_park+0x2b>
a6d: 48 8b 01 mov (%rcx),%rax
a70: 48 d1 e8 shr %rax
a73: 83 e0 01 and $0x1,%eax
a76: e9 00 00 00 00 jmp a7b
<kthread_should_stop_or_park+0x2b>
a7b: e9 00 00 00 00 jmp a80
<__pfx_kthread_freezable_should_stop>


Despite that, should I still re-submit the change this way?

thanks
-john