Re: [PATCH v11 8/8] vhost: use vhost_tasks for worker threads

From: Christian Brauner
Date: Mon May 15 2023 - 10:23:59 EST


On Sat, May 13, 2023 at 10:08:04AM -0500, Linus Torvalds wrote:
> On Sat, May 13, 2023 at 7:39 AM Thorsten Leemhuis <linux@xxxxxxxxxxxxx> wrote:
> >
> > Jumping in here, as I found another problem with that patch: it broke
> > s2idle on my laptop when a qemu-kvm VM is running, as freezing user
> > space processes now fails for me:
>
> Hmm. kthreads have PF_NOFREEZE by default, which is probably the reason.
>
> Adding
>
> current->flags |= PF_NOFREEZE;
>
> to the vhost_task setup might just fix it, but it feels a bit off.
>
> The way io_uring does this is to do
>
> if (signal_pending(current)) {
> struct ksignal ksig;
>
> if (!get_signal(&ksig))
> continue;
> break;
> }
>
> in the main loop, which ends up handling the freezer situation too.
> But it should handle things like SIGSTOP etc as well, and also exit on
> actual signals.
>
> I get the feeling that the whole "vhost_task_should_stop()" logic
> should have the exact logic above, and basically make those threads
> killable as well.
>
> Hmm?

I'm still trying to catch up after LSFMM with everything that's happened
on the fs side so coming back to this thread with a fresh set of eyes is
difficult. Sorry about the delay here.

So we seem to two immediate issues:
(1) The current logic breaks ps output because vhost creates helper
processes instead of threads. The suggested patch by Mike was to
make them proper threads again but somehow special threads in the
sense that they don't unshare signal handlers. The latter part is
possibly broken and seems hacky. (That's earlier in the thread.)
(2) Freezing of vhost tasks fails. (This mail.)

So I think we will be able to address (1) and (2) by making vhost tasks
proper threads and blocking every signal except for SIGKILL and SIGSTOP
and then having vhost handle get_signal() - as you mentioned - the same
way io uring already does. We should also remove the ingore_signals
thing completely imho. I don't think we ever want to do this with user
workers.

@Mike, can you get a patch ready ideally this week so we can get this
fixed soon?