Re: [PATCH] vfs: Fix lock inversion in drop_pagecache_sb()

From: Andrew Morton
Date: Tue Mar 25 2008 - 15:54:54 EST


On Tue, 25 Mar 2008 19:12:27 +0100
Jan Kara <jack@xxxxxxx> wrote:

> Fix longstanding lock inversion in drop_pagecache_sb by dropping inode_lock
> before calling __invalidate_mapping_pages(). We just have to make sure
> inode won't go away from under us by keeping reference to it and putting
> the reference only after we have safely resumed the scan of the inode
> list. A bit tricky but not too bad...
>
> Signed-off-by: Jan Kara <jack@xxxxxxx>
> CC: Fengguang Wu <wfg@xxxxxxxxxxxxxxxx>
> CC: David Chinner <dgc@xxxxxxx>
>
> ---
> fs/drop_caches.c | 8 +++++++-
> 1 files changed, 7 insertions(+), 1 deletions(-)
>
> diff --git a/fs/drop_caches.c b/fs/drop_caches.c
> index 59375ef..f5aae26 100644
> --- a/fs/drop_caches.c
> +++ b/fs/drop_caches.c
> @@ -14,15 +14,21 @@ int sysctl_drop_caches;
>
>  static void drop_pagecache_sb(struct super_block *sb)
>  {
> -	struct inode *inode;
> +	struct inode *inode, *toput_inode = NULL;
>
>  	spin_lock(&inode_lock);
>  	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
>  		if (inode->i_state & (I_FREEING|I_WILL_FREE))
>  			continue;

OT: it might be worth having an `if (inode->i_mapping->nrpages == 0) continue;'
here, so that inodes with nothing in pagecache are skipped cheaply.

> +		__iget(inode);
> +		spin_unlock(&inode_lock);
>  		__invalidate_mapping_pages(inode->i_mapping, 0, -1, true);
> +		iput(toput_inode);
> +		toput_inode = inode;
> +		spin_lock(&inode_lock);
>  	}
>  	spin_unlock(&inode_lock);
> +	iput(toput_inode);
>  }
>
> void drop_pagecache(void)

hrm.  So we hold an extra ref on an inode without holding inode_lock.  If
we race with invalidate_list() at unmount time, that inode looks busy, stays
stuck on s_inodes, and we get "VFS: Busy inodes after unmount ...
Self-destruct in 5 seconds.  Have a nice day...", don't we?
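
For anyone following along, the traversal the patch uses (pin the current
element, drop the list lock, do the blocking work, retake the lock, and only
then release the previously pinned element) can be modelled in userspace. The
sketch below is a single-threaded model, not kernel code. Every name in it
(struct node, list_locked, node_get(), node_put(), do_work(), walk()) is a
hypothetical stand-in chosen for illustration. It only checks the lock and
reference discipline of the loop; it does not model the race with
invalidate_list() raised above.

```c
/* Userspace model of the patched loop; all names are illustrative only. */
#include <assert.h>
#include <stddef.h>

struct node {
	struct node *next;
	int refcount;	/* stands in for the inode reference count */
	int freeing;	/* stands in for I_FREEING|I_WILL_FREE */
	int visited;	/* set once the "blocking" work has run */
};

static int list_locked;	/* stands in for inode_lock */

static void list_lock(void)
{
	assert(!list_locked);
	list_locked = 1;
}

static void list_unlock(void)
{
	assert(list_locked);
	list_locked = 0;
}

/* Like __iget(): take a reference while the list lock is held. */
static void node_get(struct node *n)
{
	assert(list_locked);
	n->refcount++;
}

/* Like iput(): may block, so the list lock must not be held; NULL is a no-op. */
static void node_put(struct node *n)
{
	assert(!list_locked);
	if (n) {
		assert(n->refcount > 0);
		n->refcount--;
	}
}

/* Stands in for __invalidate_mapping_pages(): runs unlocked on a pinned node. */
static void do_work(struct node *n)
{
	assert(!list_locked);
	assert(n->refcount > 0);
	n->visited = 1;
}

/* Walk the list with the same shape as the patched loop; returns nodes processed. */
static int walk(struct node *head)
{
	struct node *n, *toput = NULL;
	int count = 0;

	list_lock();
	for (n = head; n != NULL; n = n->next) {
		if (n->freeing)
			continue;
		node_get(n);		/* pin before dropping the lock */
		list_unlock();
		do_work(n);
		count++;
		node_put(toput);	/* release the previous node, unlocked */
		toput = n;		/* keep n pinned until the scan resumes */
		list_lock();
	}
	list_unlock();
	node_put(toput);		/* release the last pinned node */
	return count;
}
```

Calling walk() on a three-node list whose middle node is marked freeing
processes the other two and leaves every refcount where it started. The
current node stays pinned until after the lock is retaken, so n->next is only
ever read under the lock from a node that cannot have gone away.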