Re: [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

From: Eric W. Biederman
Date: Tue May 23 2023 - 11:40:10 EST


"Michael S. Tsirkin" <mst@xxxxxxxxxx> writes:

> On Sun, May 21, 2023 at 09:51:24PM -0500, Mike Christie wrote:
>> When switching from kthreads to vhost_tasks two bugs were added:
>> 1. The vhost worker tasks now show up as processes, so scripts doing ps
>> or ps a would incorrectly detect the vhost task as another process.
>> 2. kthreads disabled freezing by setting PF_NOFREEZE, but vhost tasks
>> didn't disable it or add freezer support.
>>
>> To fix both bugs, this switches the vhost task to be a thread in the
>> process that does the VHOST_SET_OWNER ioctl, and has vhost_worker call
>> get_signal to support SIGKILL/SIGSTOP and freeze signals. Note that
>> SIGKILL/STOP support is required because CLONE_THREAD requires
>> CLONE_SIGHAND, which requires those 2 signals to be supported.
>>
>> This is a modified version of a patch originally written by Linus which
>> handles his review comment to himself to rename ignore_signals to
>> block_signals to better represent what it now does. It also includes a
>> change to vhost_worker() to support SIGSTOP/KILL and freeze, and it
>> drops the wait use per Oleg's review comment that it's no longer needed
>> when using CLONE_THREAD.
>>
>> Fixes: 6e890c5d5021 ("vhost: use vhost_tasks for worker threads")
>> Signed-off-by: Mike Christie <michael.christie@xxxxxxxxxx>
>> ---
>> drivers/vhost/vhost.c | 17 ++++++++++++++++-
>> include/linux/sched/task.h | 2 +-
>> kernel/fork.c | 12 +++---------
>> kernel/signal.c | 1 +
>> kernel/vhost_task.c | 16 ++++------------
>> 5 files changed, 25 insertions(+), 23 deletions(-)
>>
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index a92af08e7864..bf83e9340e40 100644
>> --- a/drivers/vhost/vhost.c
>> +++ b/drivers/vhost/vhost.c
>> @@ -338,6 +338,7 @@ static int vhost_worker(void *data)
>>  	struct vhost_worker *worker = data;
>>  	struct vhost_work *work, *work_next;
>>  	struct llist_node *node;
>> +	bool dead = false;
>>
>>  	for (;;) {
>>  		/* mb paired w/ kthread_stop */
>> @@ -349,8 +350,22 @@ static int vhost_worker(void *data)
>>  		}
>>
>>  		node = llist_del_all(&worker->work_list);
>> -		if (!node)
>> +		if (!node) {
>>  			schedule();
>> +			/*
>> +			 * When we get a SIGKILL our release function will
>> +			 * be called. That will stop new IOs from being queued
>> +			 * and check for outstanding cmd responses. It will then
>> +			 * call vhost_task_stop to tell us to return and exit.
>> +			 */
>> +			if (!dead && signal_pending(current)) {
>> +				struct ksignal ksig;
>> +
>> +				dead = get_signal(&ksig);
>> +				if (dead)
>> +					clear_thread_flag(TIF_SIGPENDING);
>
>
> Does get_signal actually return true only on SIGKILL then?

get_signal returns the signal that was dequeued, or 0 if no signal was
dequeued.

For these threads, which block all signals but SIGSTOP and SIGKILL,
get_signal should properly never return any signal, as SIGSTOP and
SIGKILL are handled internally by get_signal.

However, get_signal was modified to have a special case for io_uring,
and now the vhost code, so that extra cleanup can be performed before
do_exit is called on the thread. That special case causes get_signal
to return with the value of SIGKILL when the process exits.

The process can exit with do_group_exit, aka exit(2), or with any
fatal signal, not just SIGKILL, and get_signal on these threads will
return.

Eric