Re: [PATCH] shrink_dcache_parent() races against shrink_dcache_memory()

From: Kirill Korotaev
Date: Tue Jan 24 2006 - 04:45:33 EST


I like your idea, but some comments below... I doubt it works.
I will think it over a bit later...

Kirill
P.S. it's not easily reproducable. Before my fix it took us 3-6 hours on automated stress testing to hit this bug. Right now I can't setup it for testing, maybe in a week or so.

On Mon, Jan 23, Kirill Korotaev wrote:

[snip]

Hmm, will think about that one again. shrink_dcache_parent() and
shrink_dcache_memory()/dput() are not racing against each other now since the
reference counting is done before giving up dcache_lock and the select_parent
could start.

Regards,
Jan



I have been playing around with a possible solution to the problem.
I have not been able to reproduce this issue, hence I am unable to verify
if the patch below fixes the problem. I have run the system with this
patch and verified that no obvious badness is observed.

Kirill, Jan if you can easily reproduce the problem, could you
try this patch and review it as well for correctness of the solution?

All callers that try to free memory set the PF_MEMALLOC flag, we check
if the super block is going away due to an unmount, if so we ask the
allocator to return.

The patch adds additional cost of holding the sb_lock for each dentry
being pruned. It holds sb_lock under dentry->d_lock and dcache_lock,
I am not sure about the locking order of these locks.

Signed-off-by: Balbir Singh <balbir@xxxxxxxxxx>
---

fs/dcache.c | 23 +++++++++++++++++++++++
1 files changed, 23 insertions(+)

diff -puN fs/dcache.c~dcache_race_fix2 fs/dcache.c
--- linux-2.6/fs/dcache.c~dcache_race_fix2 2006-01-24 11:05:46.000000000 +0530
+++ linux-2.6-balbir/fs/dcache.c 2006-01-24 11:05:46.000000000 +0530
@@ -425,6 +425,29 @@ static void prune_dcache(int count)
spin_unlock(&dentry->d_lock);
continue;
}
+
+ /*
+ * Note to reviewers: our current lock order is dcache_lock,
+ * dentry->d_lock & sb_lock. Could this create a deadlock?
+ */
+ spin_lock(&sb_lock);
<<<< 1. sb_lock doesn't protect atomic_read() anyhow...
<<<< I mean, sb_lock is not required to read its value...
+ if (!atomic_read(&dentry->d_sb->s_active)) {
+ /*
+ * Race condition, umount and other pruning is happening
+ * in parallel.
+ */
+ if (current->flags & PF_MEMALLOC) {
+ /*
+ * let the allocator leave this dentry alone
+ */
+ spin_unlock(&sb_lock);
+ spin_unlock(&dentry->d_lock);
+ spin_unlock(&dcache_lock);
+ return;
<<<< you should not return, but rather 'continue'. otherwise you skip _all_ dentries, even from active super blocks.
+ }
+ }
+ spin_unlock(&sb_lock);
+
<<<< and here, when you drop sb_lock, and dentry->d_lock/dcache_lock in prune_dentry() it looks to me that we have exactly the same situation as it was without your patch:
<<<< another CPU can start umount in parallel.
<<<< maybe sb_lock barrier helps this somehow, but I can't see how yet...

<<<< another idea: down_read(&sb->s_umount) probably could help...
<<<< because it will block the whole umount operation...
<<<< but we can't take it under dcache_lock...
prune_one_dentry(dentry);
}
spin_unlock(&dcache_lock);

Thanks,
Balbir
_



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/