Re: [PATCH] exit: fix a race in release_task when flushing the dentry

From: Wen Yang
Date: Sun Nov 29 2020 - 09:45:07 EST




在 2020/11/29 下午2:05, Greg Kroah-Hartman 写道:
On Sat, Nov 28, 2020 at 11:28:53PM +0800, Wen Yang wrote:


在 2020/11/28 下午10:05, Greg Kroah-Hartman 写道:
On Sat, Nov 28, 2020 at 09:59:09PM +0800, Wen Yang wrote:


在 2020/11/28 下午4:06, Greg Kroah-Hartman 写道:
On Sat, Nov 28, 2020 at 02:47:22PM +0800, Wen Yang wrote:
[ Upstream commit 7bc3e6e55acf065500a24621f3b313e7e5998acf ]

No, that is not this commit at all.

What are you wanting to have happen here?

confused,

greg k-h


Thanks.
Let's explain it briefly:

The dentries such as /proc/<pid>/ns/ipc have the DCACHE_OP_DELETE flag, they
should be deleted when the process exits.
Suppose the following race appears:

release_task dput
-> proc_flush_task
-> dentry->d_op->d_delete(dentry)
-> __exit_signal
-> dentry->d_lockref.count-- and return.


In the proc_flush_task function, because another processe is using this
dentry, it cannot be deleted;
In the dput function, d_delete may be executed before __exit_signal (the pid
has not been unhashed), so that d_delete returns false and the dentry can
not be deleted.

So this dentry is still caches (count is 0), and its parent dentries are
also caches, and those dentries can only be deleted when drop_caches is
manually triggered.


In the release_task function, we should move proc_flush_task after the
tasklist_lock is released(Just like the commit
7bc3e6e55acf065500a24621f3b313e7e5998acf did).

I do not understand, is this a patch being submitted for the main kernel
tree, or for a stable kernel release?

If stable, please read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

If main kernel tree, you can't have the "Upstream commit" line in the
changelog text as that makes no sense at all.


Hi,
This patch is submitted to the stable branches (from 4.9.y
to 5.6.y).

This problem can also be solved if the following patch could be ported to
the stable branch:
7bc3e6e55acf ("proc: Use a list of inodes to flush from proc")
26dbc60f385f ("proc: Generalize proc_sys_prune_dcache into
proc_prune_siblings_dcache")
f90f3cafe8d5 ("proc: Use d_invalidate in proc_prune_siblings_dcache")

However, the above-mentioned patches modify too much code (more than 100
lines), and there may also be some undiscovered bugs.

So the safer method may be to apply this small patch(also ported from the
equivalent fix already exist in Linus’ tree).

We will reformat the patch later.

We always prefer to take the original, upstream patches, instead of
one-off changes as almost always, those one-off changes end up being
wrong and hard to work with over time.

So if we need more than one patch to solve this reported problem, that's
fine, can you test the above series of patches and provide a backported
set of them that we can use for this?


Ok, we will follow your suggestions.
Thanks.

--
Best wishes,
Wen