Re: [6.5-rc5 regression] core dump hangs (was Re: [Bug report] fstests generic/051 (on xfs) hang on latest linux v6.5-rc5+)

From: Jens Axboe
Date: Mon Jun 12 2023 - 13:54:08 EST


On 6/12/23 11:51?AM, Linus Torvalds wrote:
> On Mon, Jun 12, 2023 at 10:29?AM Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>> Looks fine to me to just kill it indeed, whatever we did need this
>> for is definitely no longer the case. I _think_ we used to have
>> something in the worker exit that would potentially sleep which
>> is why we killed it before doing that, now it just looks like dead
>> code.
>
> Ok, can you (and the fsstress people) confirm that this
> whitespace-damaged patch fixes the coredump issue:
>
>
> --- a/io_uring/io-wq.c
> +++ b/io_uring/io-wq.c
> @@ -221,9 +221,6 @@ static void io_worker_exit(..
> raw_spin_unlock(&wq->lock);
> io_wq_dec_running(worker);
> worker->flags = 0;
> - preempt_disable();
> - current->flags &= ~PF_IO_WORKER;
> - preempt_enable();
>
> kfree_rcu(worker, rcu);
> io_worker_ref_put(wq);

Yep, it fixes it on my end and it passes my basic tests as well.

> Jens, I think that the two lines above there, ie the whole
>
> io_wq_dec_running(worker);
> worker->flags = 0;
>
> thing may actually be the (partial?) reason for those PF_IO_WORKER
> games. It's basically doing "now I'm doing stats by hand", and I
> wonder if now it decrements the running worker one time too much?
>
> Ie when the finally *dead* worker schedules away, never to return,
> that's when that io_wq_worker_sleeping() case triggers and decrements
> things one more time.
>
> So there might be some bookkeeping reason for those games, but it
> looks like if that's the case, then that
>
> worker->flags = 0;
>
> will have already taken care of it.
>
> I wonder if those two lines could just be removed too. But I think
> that's separate from the "let's fix the core dumping" issue.

Something like that was/is my worry. Let me add some tracing to confirm
it's fine, don't fully trust just my inspection of it. I'll send out the
patch separately once done, and then would be great if you can just pick
it up so it won't have to wait until I need to send a pull later in the
week. Particularly as I have nothing planned for 6.4 unless something
else comes up of course.

--
Jens Axboe