[GIT PULL] fork: user workers & vhost

From: Christian Brauner
Date: Fri Apr 21 2023 - 09:38:00 EST


Hey Linus,

/* Summary */
This contains the work generalizing the ability to create a kernel
worker from a userspace process. Such user workers will run with the
same credentials as the userspace process they were created from,
providing stronger security and accounting guarantees than the
traditional override_creds() approach ever could've hoped for.

The original work was heavily based on and optimized for the needs of
io_uring, which was the first user. However, as it quickly turned out, the
ability to create user workers inheriting properties from a userspace
process is generally useful.

The vhost subsystem currently creates workers using the kthread api. The
consequences of using the kthread api are that RLIMITs don't work
correctly as they are inherited from kthreadd. This leads to bugs
where more workers are created than would be allowed by the RLIMITs of
the userspace process on behalf of which the workers are created.
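
As a rough illustration of the RLIMIT problem, here is a userspace toy
model, not kernel code. All names in it (task_model, user_model,
may_create_worker, spawn_workers) are hypothetical, and it deliberately
ignores how the kernel really accounts tasks. It only shows that the
result of a limit check depends on which task the limit is taken from.

```c
/* Userspace toy model; none of these names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task_model {
	unsigned long nproc_limit;	/* stands in for a process-count RLIMIT */
};

struct user_model {
	unsigned long nr_tasks;		/* tasks currently charged to the user */
};

/* A worker may be created while the user is below the limit of the task
 * the new worker inherits its limits from. */
bool may_create_worker(const struct task_model *limit_source,
		       const struct user_model *user)
{
	return user->nr_tasks < limit_source->nproc_limit;
}

/* Try to create `want` workers and return how many were allowed. */
unsigned long spawn_workers(const struct task_model *limit_source,
			    struct user_model *user, unsigned long want)
{
	unsigned long created = 0;

	assert(limit_source != NULL && user != NULL);
	while (created < want && may_create_worker(limit_source, user)) {
		user->nr_tasks++;
		created++;
	}
	return created;
}
```

With a very large limit standing in for kthreadd, every requested worker
is created. With the owning process as the limit source, creation stops
once the owner's limit is reached.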

Problems like this disappear with user workers created from the
userspace processes for which they perform the work. In addition,
providing this api allows vhost to remove additional complexity. For
example, cgroup and mm sharing will just work out of the box with user
workers based on the relevant userspace process instead of manually
ensuring the correct cgroup and mm contexts are used.

So the vhost subsystem should simply be made to use the same mechanism
as io_uring. To this end the original mechanism used for
create_io_thread() is generalized into user workers:

* Introduce PF_USER_WORKER as a generic indicator that a given task is a
  user worker, i.e., a kernel task that was created from a userspace
  process. Now a PF_IO_WORKER thread is just a specialized version of
  PF_USER_WORKER. So io_uring io workers raise both flags.
* Make copy_process() available to core kernel code.
* Extend struct kernel_clone_args with the following bitfields, which
  tell copy_process():
  * to create a user worker (raise PF_USER_WORKER)
  * to not inherit any files from the userspace process
  * to ignore signals
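
The relationship between the two flags and the new bitfields can be
sketched in plain userspace C. This is a model under stated assumptions,
not the code in kernel/fork.c: the struct names, the helper functions,
the field names, and the numeric flag values are all chosen for
illustration only.

```c
/* Userspace model of the list above; not the actual kernel code. */
#include <assert.h>

/* Illustrative values for this model only. */
#define PF_IO_WORKER	0x00000010u
#define PF_USER_WORKER	0x00004000u

/* Hypothetical stand-in for the new kernel_clone_args bitfields. */
struct clone_args_model {
	unsigned int io_thread:1;	/* io_uring style worker */
	unsigned int user_worker:1;	/* raise PF_USER_WORKER */
	unsigned int no_files:1;	/* do not inherit files */
	unsigned int ignore_signals:1;	/* ignore signals */
};

/* Hypothetical stand-in for the resulting task state. */
struct worker_model {
	unsigned int flags;
	int inherits_files;
	int ignores_signals;
};

struct worker_model setup_worker(struct clone_args_model args)
{
	struct worker_model w = { 0 };

	/* An io worker is a specialized user worker, never the reverse. */
	assert(!args.io_thread || args.user_worker);

	if (args.user_worker)
		w.flags |= PF_USER_WORKER;
	if (args.io_thread)
		w.flags |= PF_IO_WORKER;
	w.inherits_files = !args.no_files;
	w.ignores_signals = args.ignore_signals;
	return w;
}

int is_user_worker(const struct worker_model *w)
{
	return (w->flags & PF_USER_WORKER) != 0;
}

int is_io_worker(const struct worker_model *w)
{
	return (w->flags & PF_IO_WORKER) != 0;
}
```

In this model an io worker sets both io_thread and user_worker and so
ends up with both flags, while a worker that only sets user_worker gets
PF_USER_WORKER alone and can additionally opt out of file inheritance
and signal handling.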

After all generic changes are in place, the vhost subsystem implements a
new dedicated vhost api based on user workers. Finally, vhost is
switched to rely on the new api, moving it off of kthreads.

Thanks to Mike for sticking it out and making it through this rather
arduous journey.

/* Testing */
clang: Ubuntu clang version 15.0.6
gcc: (Ubuntu 12.2.0-3ubuntu1) 12.2.0

All patches are based on 6.3-rc1 and have been sitting in linux-next.
No build failures or warnings were observed. All old and new tests in
fstests, selftests, and LTP pass without regressions.

/* Conflicts */
At the time of creating this PR no merge conflicts were reported from
linux-next and no merge conflicts showed up doing a test-merge with
current mainline.

The following changes since commit fe15c26ee26efa11741a7b632e9f23b01aca4cc6:

Linux 6.3-rc1 (2023-03-05 14:52:03 -0800)

are available in the Git repository at:

git@xxxxxxxxxxxxxxxxxxx:pub/scm/linux/kernel/git/brauner/linux tags/v6.4/kernel.user_worker

for you to fetch changes up to 6e890c5d5021ca7e69bbe203fde42447874d9a82:

vhost: use vhost_tasks for worker threads (2023-03-23 12:45:37 +0100)

Please consider pulling these changes from the signed v6.4/kernel.user_worker tag.

Thanks!
Christian

----------------------------------------------------------------
v6.4/kernel.user_worker

----------------------------------------------------------------
Mike Christie (11):
      csky: Remove kernel_thread declaration
      kernel: Allow a kernel thread's name to be set in copy_process
      kthread: Pass in the thread's name during creation
      kernel: Make io_thread and kthread bit fields
      fork/vm: Move common PF_IO_WORKER behavior to new flag
      fork: add kernel_clone_args flag to not dup/clone files
      fork: Add kernel_clone_args flag to ignore signals
      fork: allow kernel code to call copy_process
      vhost_task: Allow vhost layer to use copy_process
      vhost: move worker thread fields to new struct
      vhost: use vhost_tasks for worker threads

 MAINTAINERS                       |   2 +
 arch/csky/include/asm/processor.h |   2 -
 drivers/vhost/Kconfig             |   5 ++
 drivers/vhost/vhost.c             | 124 ++++++++++++++++++--------------------
 drivers/vhost/vhost.h             |  11 +++-
 include/linux/sched.h             |   2 +-
 include/linux/sched/task.h        |  13 +++-
 include/linux/sched/vhost_task.h  |  23 +++++++
 init/main.c                       |   2 +-
 kernel/Makefile                   |   1 +
 kernel/fork.c                     |  25 ++++++--
 kernel/kthread.c                  |  33 ++++------
 kernel/vhost_task.c               | 117 +++++++++++++++++++++++++++++++++++
 mm/vmscan.c                       |   4 +-
 14 files changed, 263 insertions(+), 101 deletions(-)
 create mode 100644 include/linux/sched/vhost_task.h
 create mode 100644 kernel/vhost_task.c