Re: [PATCH 1/1] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

From: Oleg Nesterov
Date: Mon Jun 05 2023 - 09:27:37 EST


On 06/02, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@xxxxxxxxxx> writes:
>
> > Hi Mike,
> >
> > sorry, but somehow I can't understand this patch...
> >
> > I'll try to read it with a fresh head over the weekend, but for example,
> >
> > On 06/01, Mike Christie wrote:
> >>
> >>  static int vhost_task_fn(void *data)
> >>  {
> >>  	struct vhost_task *vtsk = data;
> >> -	int ret;
> >> +	bool dead = false;
> >> +
> >> +	for (;;) {
> >> +		bool did_work;
> >> +
> >> +		/* mb paired w/ vhost_task_stop */
> >> +		if (test_bit(VHOST_TASK_FLAGS_STOP, &vtsk->flags))
> >> +			break;
> >> +
> >> +		if (!dead && signal_pending(current)) {
> >> +			struct ksignal ksig;
> >> +			/*
> >> +			 * Calling get_signal will block in SIGSTOP,
> >> +			 * or clear fatal_signal_pending, but remember
> >> +			 * what was set.
> >> +			 *
> >> +			 * This thread won't actually exit until all
> >> +			 * of the file descriptors are closed, and
> >> +			 * the release function is called.
> >> +			 */
> >> +			dead = get_signal(&ksig);
> >> +			if (dead)
> >> +				clear_thread_flag(TIF_SIGPENDING);
> >
> > this can't be right or I am totally confused.
> >
> > Another signal_wake_up() can come right after clear(SIGPENDING).
>
> Technically yes.

...

> Beyond that clearing TIF_SIGPENDING is just an optimization so
> the thread can sleep in schedule and not spin.

Yes. So if another signal_wake_up() comes after clear(SIGPENDING),
this code will spin in a busy-wait loop waiting for VHOST_TASK_FLAGS_STOP.
Obviously not good, and it can even deadlock on UP && !PREEMPT.
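
To make the race concrete, here is a rough userspace model of the loop, not
kernel code. Everything in it is hypothetical: struct model_task and its two
booleans stand in for TIF_SIGPENDING and VHOST_TASK_FLAGS_STOP,
model_schedule() stands in for an interruptible schedule() that does not
sleep while a signal is pending, and late_signal models a second
signal_wake_up() landing right after the clear. It also assumes the loop
finds no work to do.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for TIF_SIGPENDING and VHOST_TASK_FLAGS_STOP. */
struct model_task {
	bool sigpending;
	bool stop;
};

/*
 * Models an interruptible schedule(): returns true if the task really
 * slept. With a signal pending the task stays runnable, so no sleep.
 */
static bool model_schedule(const struct model_task *t)
{
	return !t->sigpending;
}

/*
 * Runs the quoted loop shape for at most max_iters iterations and
 * returns how many of them failed to sleep, i.e. how often it spun.
 */
static unsigned int model_loop(struct model_task *t, bool late_signal,
			       unsigned int max_iters)
{
	bool dead = false;
	unsigned int spins = 0;

	for (unsigned int i = 0; i < max_iters; i++) {
		if (t->stop)
			break;

		if (!dead && t->sigpending) {
			dead = true;		/* get_signal() saw a fatal signal */
			t->sigpending = false;	/* clear_thread_flag(TIF_SIGPENDING) */
			if (late_signal)
				t->sigpending = true;	/* another signal_wake_up() */
		}

		/* Once dead is set, nothing above clears sigpending again. */
		if (!model_schedule(t))
			spins++;
	}
	return spins;
}
```

In this model, a task that starts with sigpending set and gets no late signal
sleeps on every iteration, while the same task with late_signal set never
sleeps at all until something else sets stop.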

Oleg.