Re: [PATCH 1/2 v2] io_uring: Fix race condition when sqp thread goes to sleep

From: Olivier Langlois
Date: Tue Jun 22 2021 - 18:37:18 EST


On Tue, 2021-06-22 at 21:45 +0100, Pavel Begunkov wrote:
> On 6/22/21 7:55 PM, Olivier Langlois wrote:
> > If an asynchronous completion happens before the task is preparing
> > itself to wait and set its state to TASK_INTERRUPTIBLE, the
> > completion
> > will not wake up the sqp thread.
> >
> > Signed-off-by: Olivier Langlois <olivier@xxxxxxxxxxxxxx>
> > ---
> >  fs/io_uring.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/fs/io_uring.c b/fs/io_uring.c
> > index fc8637f591a6..02f789e07d4c 100644
> > --- a/fs/io_uring.c
> > +++ b/fs/io_uring.c
> > @@ -6902,7 +6902,7 @@ static int io_sq_thread(void *data)
> >                 }
> >  
> >                 prepare_to_wait(&sqd->wait, &wait,
> > TASK_INTERRUPTIBLE);
> > -               if (!io_sqd_events_pending(sqd)) {
> > +               if (!io_sqd_events_pending(sqd) && !current-
> > >task_works) {
>
> Agree that it should be here, but we also lack a good enough
> task_work_run() around, and that may send the task burn CPU
> for a while in some cases. Let's do
>
> if (!io_sqd_events_pending(sqd) && !io_run_task_work())
>    ...

I can do that if you want, but considering that the function is inline
and the race condition is a relatively rare occurrence, is the cost of
the inline expansion really worth it in this case?
>
> fwiw, no need to worry about TASK_INTERRUPTIBLE as
> io_run_task_work() sets it to TASK_RUNNING.

I wasn't worried about that, as I believe that finish_wait() takes
care of the state as well.

What I wasn't sure about was whether the patch was sufficient to
totally eliminate the race condition.

I had to educate myself about how schedule() works to appreciate its
design and convince myself that the patch was good.
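
For anyone following along, here is a small userspace model of the
ordering argument as I understand it. It is only a sketch: every name
in it (model_task, model_complete, model_lost_wakeup, ...) is
hypothetical and none of it is kernel API. It assumes that a completion
queues the work and then wakes the task, that waking a task which is
already running changes nothing, and that schedule() only blocks when
the task state is not running.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum model_state { MODEL_RUNNING, MODEL_INTERRUPTIBLE };

struct model_task {
    enum model_state state;
    bool work_pending;      /* stands in for current->task_works != NULL */
};

/* When the asynchronous completion arrives relative to the wait sequence. */
enum model_arrival { ARRIVES_BEFORE_PREPARE, ARRIVES_AFTER_CHECK };

/* Completion side: queue the work, then wake the task (make it runnable). */
static void model_complete(struct model_task *t)
{
    assert(t != NULL);
    t->work_pending = true;
    t->state = MODEL_RUNNING;
}

/*
 * Returns true if the task ends up sleeping with work still pending,
 * i.e. the wakeup was lost. check_work selects the patched condition.
 */
static bool model_lost_wakeup(enum model_arrival when, bool check_work)
{
    struct model_task t = { MODEL_RUNNING, false };
    bool want_sleep;

    if (when == ARRIVES_BEFORE_PREPARE)
        model_complete(&t);          /* wakeup is a no-op: task is running */

    t.state = MODEL_INTERRUPTIBLE;   /* models prepare_to_wait() */

    /* models the if (); "no sqd events pending" is assumed throughout */
    want_sleep = check_work ? !t.work_pending : true;

    if (when == ARRIVES_AFTER_CHECK)
        model_complete(&t);          /* wakeup resets state to running */

    /* models schedule(): it only blocks if the state is not running */
    return want_sleep && t.state != MODEL_RUNNING && t.work_pending;
}
```

In this model, only the "completion before prepare_to_wait()" ordering
without the task_works check ends up sleeping with work pending. A
completion that lands after the check puts the state back to running,
so the schedule() step does not block.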
>
> >                         needs_sched = true;
> >                         list_for_each_entry(ctx, &sqd->ctx_list,
> > sqd_list) {
> >                                 io_ring_set_wakeup_flag(ctx);
> >
>