[PATCH] fs: process fput task_work with TWA_SIGNAL

From: Jens Axboe
Date: Tue Jan 05 2021 - 13:30:11 EST


Song reported a boot regression in a kvm image with 5.11-rc, and bisected
it down to the commit named in the Fixes: tag below. Debugging showed that
the boot stalls while a task is waiting on a pipe being released. As we no
longer run task_work from get_signal() unless it's queued with TWA_SIGNAL,
the task that did the final fput goes idle without running the task_work.
This prevents ->release() from being called on the pipe, which another
boot task is waiting on.

Use TWA_SIGNAL for the file fput work to ensure it's run before the task
goes idle.

Fixes: 98b89b649fce ("signal: kill JOBCTL_TASK_WORK")
Reported-by: Song Liu <songliubraving@xxxxxx>
Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>

---

The other alternative here is obviously to re-instate the:

	if (unlikely(current->task_works))
		task_work_run();

in get_signal() that we had before this change. That might be safer if
other users also need their work run in a timely fashion, though I do
think it's cleaner long term to mark task_work with the notification
type it actually needs. Comments welcome...
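
To make the trade-off between the two options concrete, here is a small
userspace toy model. It is only a sketch, not kernel code: every toy_*
name is made up for illustration, the notify_signal flag merely stands in
for the TWA_SIGNAL notification, and the model assumes the task passes
through get_signal() once and then goes idle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel's notification modes (hypothetical names). */
enum toy_notify { TOY_TWA_RESUME, TOY_TWA_SIGNAL };

struct toy_work {
	void (*fn)(void *arg);
	void *arg;
	struct toy_work *next;
};

struct toy_task {
	struct toy_work *works;	/* pending work, loosely like task->task_works */
	bool notify_signal;	/* set only when work is queued with TOY_TWA_SIGNAL */
};

static void toy_task_work_add(struct toy_task *t, struct toy_work *w,
			      enum toy_notify mode)
{
	assert(w->fn != NULL);
	w->next = t->works;
	t->works = w;
	if (mode == TOY_TWA_SIGNAL)
		t->notify_signal = true;
}

static void toy_task_work_run(struct toy_task *t)
{
	while (t->works) {
		struct toy_work *w = t->works;

		t->works = w->next;
		w->fn(w->arg);
	}
}

/*
 * Model of the signal path. With legacy_check set, pending work is run
 * unconditionally (the alternative above). Without it, work runs only
 * if it was queued with the signal-style notification.
 */
static void toy_get_signal(struct toy_task *t, bool legacy_check)
{
	if (legacy_check && t->works)
		toy_task_work_run(t);
	if (t->notify_signal) {
		t->notify_signal = false;
		toy_task_work_run(t);
	}
}

/* Stands in for ____fput() ending up in the pipe's ->release(). */
static void toy_release(void *arg)
{
	*(bool *)arg = true;
}

/* Returns true if the release ran before the task went idle. */
bool toy_release_runs_before_idle(enum toy_notify mode, bool legacy_check)
{
	struct toy_task task = { NULL, false };
	bool released = false;
	struct toy_work fput_work = { toy_release, &released, NULL };

	toy_task_work_add(&task, &fput_work, mode);
	toy_get_signal(&task, legacy_check);
	/* The task goes idle here; anything still queued stays queued. */
	return released;
}
```

In this model, toy_release_runs_before_idle(TOY_TWA_RESUME, false) is the
regression: the release never runs. Passing TOY_TWA_SIGNAL corresponds to
the patch below, and passing legacy_check = true corresponds to
re-instating the check in get_signal().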

diff --git a/fs/file_table.c b/fs/file_table.c
index 45437f8e1003..7c76b611c95b 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -338,7 +338,13 @@ void fput_many(struct file *file, unsigned int refs)

 		if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) {
 			init_task_work(&file->f_u.fu_rcuhead, ____fput);
-			if (!task_work_add(task, &file->f_u.fu_rcuhead, TWA_RESUME))
+			/*
+			 * We could be dependent on the fput task_work running,
+			 * eg for pipes where someone is waiting on release
+			 * being called. Use TWA_SIGNAL to ensure it's run
+			 * before the task goes idle.
+			 */
+			if (!task_work_add(task, &file->f_u.fu_rcuhead, TWA_SIGNAL))
 				return;
 			/*
 			 * After this task has run exit_task_work(),

--
Jens Axboe