Re: [PATCH 1/1] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

From: Oleg Nesterov
Date: Fri Jun 02 2023 - 15:24:16 EST


Hi Mike,

Sorry, but somehow I can't understand this patch...

I'll try to read it with a fresh head over the weekend, but for example,

On 06/01, Mike Christie wrote:
>
> static int vhost_task_fn(void *data)
> {
> 	struct vhost_task *vtsk = data;
> -	int ret;
> +	bool dead = false;
> +
> +	for (;;) {
> +		bool did_work;
> +
> +		/* mb paired w/ vhost_task_stop */
> +		if (test_bit(VHOST_TASK_FLAGS_STOP, &vtsk->flags))
> +			break;
> +
> +		if (!dead && signal_pending(current)) {
> +			struct ksignal ksig;
> +			/*
> +			 * Calling get_signal will block in SIGSTOP,
> +			 * or clear fatal_signal_pending, but remember
> +			 * what was set.
> +			 *
> +			 * This thread won't actually exit until all
> +			 * of the file descriptors are closed, and
> +			 * the release function is called.
> +			 */
> +			dead = get_signal(&ksig);
> +			if (dead)
> +				clear_thread_flag(TIF_SIGPENDING);

this can't be right or I am totally confused.

Another signal_wake_up() can come right after clear(SIGPENDING).
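
To make the concern concrete, here is a minimal userspace sketch of the
quoted loop. It is not kernel code: `sigpending` is a plain bool standing in
for TIF_SIGPENDING, and `model_worker()` and its `late_signal_at` parameter
are hypothetical names invented for this illustration. It only shows that,
once `dead` is set, a later wake-up sets the flag again and nothing on this
path clears it.

```c
#include <stdbool.h>

/*
 * Userspace model only: `sigpending` stands in for TIF_SIGPENDING and
 * `late_signal_at` is a hypothetical knob choosing the iteration at which
 * a second sender does the equivalent of signal_wake_up().
 * Returns how many iterations ran with dead == true and the flag still set.
 */
static int model_worker(int iterations, int late_signal_at)
{
	bool sigpending = true;	/* the first, fatal signal is already queued */
	bool dead = false;
	int stale = 0;

	for (int i = 0; i < iterations; i++) {
		if (!dead && sigpending) {
			dead = true;		/* get_signal() reported a fatal signal */
			sigpending = false;	/* clear_thread_flag(TIF_SIGPENDING) */
		}

		if (i == late_signal_at)
			sigpending = true;	/* another signal_wake_up() arrives */

		if (dead && sigpending)
			stale++;		/* nothing on this path clears it again */
	}
	return stale;
}
```

Under these assumptions, a wake-up that arrives on iteration 1 of 5 leaves
the flag set for the remaining 4 iterations, while a run with no second
wake-up never sees it set again.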


Again, I'll try to re-read this patch, but let me ask anyway...

Do we have a plan B? I mean... iirc you have mentioned that you can
change these code paths to do something like

	if (killed)
		tell_the_drivers_that_all_callbacks_will_fail();


so that vhost_worker() can exit after get_signal() returns SIGKILL.

Probably I misunderstood you, but it would be nice to avoid the changes
in coredump/etc code just to add a temporary (iiuc!) fix.

Oleg.