[patch] mm: vmscan: clear kswapd's special reclaim powers before exiting

From: Johannes Weiner
Date: Thu Jun 05 2014 - 08:37:01 EST


When kswapd exits, it can end up taking locks that were previously
held by allocating tasks while they waited for reclaim. Lockdep
currently warns about this:

On Wed, May 28, 2014 at 06:06:34PM +0800, Gu Zheng wrote:
> [ 2457.683370] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage.
> [ 2457.761540] kswapd2/1151 [HC0[0]:SC0[0]:HE1:SE1] takes:
> [ 2457.824102] (&sig->group_rwsem){+++++?}, at: [<ffffffff81071864>] exit_signals+0x24/0x130
> [ 2457.923538] {RECLAIM_FS-ON-W} state was registered at:
> [ 2457.985055] [<ffffffff810bfc99>] mark_held_locks+0xb9/0x140
> [ 2458.053976] [<ffffffff810c1e3a>] lockdep_trace_alloc+0x7a/0xe0
> [ 2458.126015] [<ffffffff81194f47>] kmem_cache_alloc_trace+0x37/0x240
> [ 2458.202214] [<ffffffff812c6e89>] flex_array_alloc+0x99/0x1a0
> [ 2458.272175] [<ffffffff810da563>] cgroup_attach_task+0x63/0x430
> [ 2458.344214] [<ffffffff810dcca0>] attach_task_by_pid+0x210/0x280
> [ 2458.417294] [<ffffffff810dcd26>] cgroup_procs_write+0x16/0x20
> [ 2458.488287] [<ffffffff810d8410>] cgroup_file_write+0x120/0x2c0
> [ 2458.560320] [<ffffffff811b21a0>] vfs_write+0xc0/0x1f0
> [ 2458.622994] [<ffffffff811b2bac>] SyS_write+0x4c/0xa0
> [ 2458.684618] [<ffffffff815ec3c0>] tracesys+0xdd/0xe2
> [ 2458.745214] irq event stamp: 49
> [ 2458.782794] hardirqs last enabled at (49): [<ffffffff815e2b56>] _raw_spin_unlock_irqrestore+0x36/0x70
> [ 2458.894388] hardirqs last disabled at (48): [<ffffffff815e337b>] _raw_spin_lock_irqsave+0x2b/0xa0
> [ 2459.000771] softirqs last enabled at (0): [<ffffffff81059247>] copy_process.part.24+0x627/0x15f0
> [ 2459.107161] softirqs last disabled at (0): [< (null)>] (null)
> [ 2459.195852]
> [ 2459.195852] other info that might help us debug this:
> [ 2459.274024] Possible unsafe locking scenario:
> [ 2459.274024]
> [ 2459.344911] CPU0
> [ 2459.374161] ----
> [ 2459.403408] lock(&sig->group_rwsem);
> [ 2459.448490] <Interrupt>
> [ 2459.479825] lock(&sig->group_rwsem);
> [ 2459.526979]
> [ 2459.526979] *** DEADLOCK ***
> [ 2459.526979]
> [ 2459.597866] no locks held by kswapd2/1151.
> [ 2459.646896]
> [ 2459.646896] stack backtrace:
> [ 2459.699049] CPU: 30 PID: 1151 Comm: kswapd2 Not tainted 3.10.39+ #4
> [ 2459.774098] Hardware name: FUJITSU PRIMEQUEST2800E/SB, BIOS PRIMEQUEST 2000 Series BIOS Version 01.48 05/07/2014
> [ 2459.895983] ffffffff82284bf0 ffff88085856bbf8 ffffffff815dbcf6 ffff88085856bc48
> [ 2459.985003] ffffffff815d67c6 0000000000000000 ffff880800000001 ffff880800000001
> [ 2460.074024] 000000000000000a ffff88085edc9600 ffffffff810be0e0 0000000000000009
> [ 2460.163087] Call Trace:
> [ 2460.192345] [<ffffffff815dbcf6>] dump_stack+0x19/0x1b
> [ 2460.253874] [<ffffffff815d67c6>] print_usage_bug+0x1f7/0x208
> [ 2460.399807] [<ffffffff810bfb5d>] mark_lock+0x21d/0x2a0
> [ 2460.462369] [<ffffffff810c076a>] __lock_acquire+0x52a/0xb60
> [ 2460.735516] [<ffffffff810c1592>] lock_acquire+0xa2/0x140
> [ 2460.935691] [<ffffffff815e01e1>] down_read+0x51/0xa0
> [ 2461.062888] [<ffffffff81071864>] exit_signals+0x24/0x130
> [ 2461.127536] [<ffffffff81060d55>] do_exit+0xb5/0xa50
> [ 2461.320433] [<ffffffff8108303b>] kthread+0xdb/0x100
> [ 2461.532049] [<ffffffff815ec0ec>] ret_from_fork+0x7c/0xb0
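
The rule behind this report can be sketched in plain userspace C. This is
only a simplified model, not lockdep's implementation: the names
lock_class_model, mark_usage, HELD_OVER_RECLAIM and USED_IN_RECLAIM are
hypothetical, and the read/write distinction in the report is ignored. A
lock class becomes inconsistent once it has been both held across an
allocation that may enter reclaim and acquired from reclaim context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical usage bits, loosely modelled on the two states in the report. */
#define HELD_OVER_RECLAIM (1u << 0) /* like RECLAIM_FS-ON: held while allocating */
#define USED_IN_RECLAIM   (1u << 1) /* like IN-RECLAIM_FS: taken by a reclaimer */

/* Hypothetical stand-in for a lock class and its recorded usage history. */
struct lock_class_model {
	unsigned int usage;
};

/*
 * Record one more usage of the lock class.
 * Returns false once both usages have been seen, which is the
 * combination that could deadlock: the reclaimer would wait for a lock
 * whose holder is itself waiting for reclaim.
 */
bool mark_usage(struct lock_class_model *cls, unsigned int bit)
{
	const unsigned int both = HELD_OVER_RECLAIM | USED_IN_RECLAIM;

	assert(cls != NULL);
	cls->usage |= bit;
	return (cls->usage & both) != both;
}
```

In this model, cgroup_attach_task() allocating under the lock corresponds to
marking HELD_OVER_RECLAIM, and the exiting kswapd taking the same lock in
exit_signals() corresponds to marking USED_IN_RECLAIM, at which point
mark_usage() returns false.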

This is because the kswapd thread is still marked as a reclaimer at
the time of exit. But because it is exiting, nobody is actually
waiting on it to make reclaim progress anymore, and it's nothing but a
regular thread at this point. Be tidy and strip it of all its powers
(PF_MEMALLOC, PF_SWAPWRITE, PF_KSWAPD, and the lockdep reclaim state)
before returning from the thread function.
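
As a rough illustration of that cleanup, here is a userspace sketch. The
struct task_model, the helper names and the flag bit values are made up
for the example (the real PF_* values live in include/linux/sched.h), and
the boolean merely stands in for lockdep's per-task reclaim marking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative bit values only, not the kernel's definitions. */
#define PF_MEMALLOC  (1u << 0)
#define PF_SWAPWRITE (1u << 1)
#define PF_KSWAPD    (1u << 2)

/* Hypothetical stand-in for the few task fields the patch touches. */
struct task_model {
	unsigned int flags;
	void *reclaim_state;
	bool lockdep_in_reclaim; /* models the lockdep reclaim state */
};

/* What the thread sets up when it starts acting as a reclaimer. */
void kswapd_model_enter(struct task_model *tsk, void *reclaim_state)
{
	assert(tsk != NULL);
	tsk->reclaim_state = reclaim_state;
	tsk->flags |= PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD;
	tsk->lockdep_in_reclaim = true;
}

/* The cleanup described above: drop every reclaimer attribute, keep the rest. */
void kswapd_model_exit(struct task_model *tsk)
{
	assert(tsk != NULL);
	tsk->flags &= ~(PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD);
	tsk->reclaim_state = NULL;
	tsk->lockdep_in_reclaim = false;
}
```

After kswapd_model_exit() the task keeps any unrelated flag bits but no
longer looks like a reclaimer, so locks taken later on the exit path would
be judged like those of any ordinary thread.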

Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
---
mm/vmscan.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 9a63d13739a6..4ac2eab860d2 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3425,7 +3425,10 @@ static int kswapd(void *p)
 		}
 	}
 
+	tsk->flags &= ~(PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD);
 	current->reclaim_state = NULL;
+	lockdep_clear_current_reclaim_state();
+
 	return 0;
 }

--
2.0.0
