Re: [syzbot] WARNING: locking bug in umh_complete

From: Peter Zijlstra
Date: Fri Feb 03 2023 - 07:01:22 EST


On Fri, Feb 03, 2023 at 07:22:43PM +0900, Tetsuo Handa wrote:
> On 2023/01/27 10:41, Hillf Danton wrote:

> >> Call Trace:
> >> <TASK>
> >> lock_acquire kernel/locking/lockdep.c:5668 [inline]
> >> lock_acquire+0x1e3/0x630 kernel/locking/lockdep.c:5633
> >> __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
> >> _raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/spinlock.c:162
> >> complete+0x1d/0x1f0 kernel/sched/completion.c:32
> >> umh_complete+0x32/0x90 kernel/umh.c:59
> >> call_usermodehelper_exec_sync kernel/umh.c:144 [inline]
> >> call_usermodehelper_exec_work+0x115/0x180 kernel/umh.c:167
> >> process_one_work+0x9bf/0x1710 kernel/workqueue.c:2289
> >> worker_thread+0x669/0x1090 kernel/workqueue.c:2436
> >> kthread+0x2e8/0x3a0 kernel/kthread.c:376
> >> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
> >> </TASK>
> >
> > This is an interesting case - given that done is initialized on the stack,
> > lockdep should not have detected any garbage.
> >
> > One explanation for the report is a uaf on the waker side, and it can be
> > tested with the diff below when a reproducer is available.
> >
> > Hillf
> >
> > --- a/kernel/umh.c
> > +++ b/kernel/umh.c
> > @@ -452,6 +452,12 @@ int call_usermodehelper_exec(struct subp
> > /* umh_complete() will see NULL and free sub_info */
> > if (xchg(&sub_info->complete, NULL))
> > goto unlock;
> > + else {
> > + /* wait for umh_complete() to finish in a bid to avoid
> > + * uaf because done is destructed
> > + */

Invalid comment style at the very least.

> > + wait_for_completion(&done);
> > + }
> > }
> >
> > wait_done:
> > --
>
> Yes, this bug is caused by commit f5d39b020809 ("freezer,sched: Rewrite core freezer
> logic"), because that commit, for an unknown reason, omits the
> wait_for_completion(&done) call when wait_for_completion_state(&done, state)
> returns -ERESTARTSYS.
>
> Peter, is it safe to restore the wait_for_completion(&done) call?

Urgh, that code is terrible.. the way I read it was that it would
wait_for_completion_killable() if KILLABLE and assumed the
second wait_for_completion() would NOP out because we'd already
completed on the first.

I don't see how adding a second wait is correct in the case of
-ERESTARTSYS; what's to stop this second wait from also getting
interrupted like that?

Should there be a loop?
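
To make the question concrete, here is a rough userspace model, not kernel
code. Everything in it is an assumption made for illustration:
struct mock_completion, mock_wait_interruptible(), wait_twice() and
wait_in_loop() are hypothetical names, and the model does not capture a
signal staying pending across retries. It only shows that a single retry of
an interruptible wait can fail the same way the first wait did, while a loop
returns only after a wait that was not interrupted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The kernel-internal value; userspace errno.h does not define it. */
#define ERESTARTSYS 512

/* Hypothetical stand-in for a completion; not the kernel's struct. */
struct mock_completion {
	bool done;		/* set once the wait ran to completion */
	int interrupts_left;	/* how many more waits get interrupted */
};

/*
 * Hypothetical model of an interruptible wait. While interruptions
 * remain it returns -ERESTARTSYS; otherwise it "blocks" until the
 * completion is done and returns 0.
 */
static int mock_wait_interruptible(struct mock_completion *c)
{
	assert(c != NULL);

	if (c->done)
		return 0;
	if (c->interrupts_left > 0) {
		c->interrupts_left--;
		return -ERESTARTSYS;
	}
	c->done = true;
	return 0;
}

/* One retry only: the second wait can be interrupted just like the first. */
static int wait_twice(struct mock_completion *c)
{
	int ret = mock_wait_interruptible(c);

	if (ret == -ERESTARTSYS)
		ret = mock_wait_interruptible(c);
	return ret;
}

/* Retry until a wait finishes without being interrupted. */
static int wait_in_loop(struct mock_completion *c, int *attempts)
{
	int ret;

	*attempts = 0;
	do {
		ret = mock_wait_interruptible(c);
		(*attempts)++;
	} while (ret == -ERESTARTSYS);
	return ret;
}
```

In this model, two queued interruptions make wait_twice() return
-ERESTARTSYS with the completion still not done, whereas wait_in_loop()
returns 0 on its third attempt. Whether a loop is the right shape for the
real code is a separate matter that this sketch does not settle.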