Re: [PATCH/RFC] core/nfsd: allow kernel threads to use task_work.

From: Christian Brauner
Date: Tue Nov 28 2023 - 10:33:39 EST


On Tue, Nov 28, 2023 at 02:52:59PM +0100, Oleg Nesterov wrote:
> On 11/28, Christian Brauner wrote:
> >
> > Should be simpler if you invert the logic?
> >
> > COMPLETELY UNTESTED
>
> Agreed, this looks much better to me. But perhaps we can just add the new
> PF_KTHREAD_XXX flag and change fput
>
>
> --- a/fs/file_table.c
> +++ b/fs/file_table.c
> @@ -445,7 +445,8 @@ void fput(struct file *file)
> if (atomic_long_dec_and_test(&file->f_count)) {
> struct task_struct *task = current;
>
> - if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) {
> + if (likely(!in_interrupt() &&
> + task->flags & (PF_KTHREAD|PF_KTHREAD_XXX) != PF_KTHREAD) {
> init_task_work(&file->f_rcuhead, ____fput);
> if (!task_work_add(task, &file->f_rcuhead, TWA_RESUME))
> return;
>
> ?
>
> Then nfsd() can simply set PF_KTHREAD_XXX. This looks even simpler to me.

Yeah, I had played with that as well. Only reason I didn't do it was to
avoid a PF_* flag. If that's preferable it might be worth to just add
PF_TASK_WORK and decouple this from PF_KTHREAD. kthread creation and
userspace process creation are all based on the same struct
kernel_clone_args for a while now ever since we added this for clone3()
so we catch everything in copy_process():

diff --git a/fs/file_table.c b/fs/file_table.c
index 6deac386486d..5d3eb5ef4fc7 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -437,7 +437,7 @@ void fput(struct file *file)
file_free(file);
return;
}
- if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) {
+ if (likely(!in_interrupt() && (task->flags & PF_TASK_WORK))) {
init_task_work(&file->f_rcuhead, ____fput);
if (!task_work_add(task, &file->f_rcuhead, TWA_RESUME))
return;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 292c31697248..8dfc06acc6a0 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1755,7 +1755,7 @@ extern struct pid *cad_pid;
* I am cleaning dirty pages from some other bdi. */
#define PF_KTHREAD 0x00200000 /* I am a kernel thread */
#define PF_RANDOMIZE 0x00400000 /* Randomize virtual address space */
-#define PF__HOLE__00800000 0x00800000
+#define PF_TASK_WORK 0x00800000
#define PF__HOLE__01000000 0x01000000
#define PF__HOLE__02000000 0x02000000
#define PF_NO_SETAFFINITY 0x04000000 /* Userland is not allowed to meddle with cpus_mask */
diff --git a/kernel/fork.c b/kernel/fork.c
index 10917c3e1f03..2604235c800f 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2346,6 +2346,14 @@ __latent_entropy struct task_struct *copy_process(
if (args->io_thread)
p->flags |= PF_IO_WORKER;

+ /*
+ * By default only non-kernel threads can use task work. Kernel
+ * threads that manage task work explicitly can add that flag in
+ * their kthread callback.
+ */
+ if (!args->kthread)
+ p->flags |= PF_TASK_WORK;
+
if (args->name)
strscpy_pad(p->comm, args->name, sizeof(p->comm));