Suspicious RCU usage in linux-next: Bisected to commit d52d3997

From: Larry Finger
Date: Sun Jun 14 2015 - 19:06:36 EST


When booting kernels built from linux-next, the following splat appears in the log:

[ 2.816564] ===============================
[ 2.816986] [ INFO: suspicious RCU usage. ]
[ 2.817402] 4.1.0-rc7-next-20150612 #1 Not tainted
[ 2.817881] -------------------------------
[ 2.818297] kernel/sched/core.c:7318 Illegal context switch in RCU-bh read-side critical section!
[ 2.819180]
other info that might help us debug this:

[ 2.819947]
rcu_scheduler_active = 1, debug_locks = 0
[ 2.820578] 3 locks held by systemd/1:
[ 2.820954] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff815f0c8f>] rtnetlink_rcv+0x1f/0x40
[ 2.821855] #1: (rcu_read_lock_bh){......}, at: [<ffffffff816a34e2>] ipv6_add_addr+0x62/0x540
[ 2.822808] #2: (addrconf_hash_lock){+...+.}, at: [<ffffffff816a3604>] ipv6_add_addr+0x184/0x540
[ 2.823790]
stack backtrace:
[ 2.824212] CPU: 0 PID: 1 Comm: systemd Not tainted 4.1.0-rc7-next-20150612 #1
[ 2.824932] Hardware name: TOSHIBA TECRA A50-A/TECRA A50-A, BIOS Version 4.20 04/17/2014
[ 2.825751] 0000000000000001 ffff880224e07838 ffffffff817263a4 ffffffff810ccf2a
[ 2.826560] ffff880224e08000 ffff880224e07868 ffffffff810b6827 0000000000000000
[ 2.827368] ffffffff81a445d3 00000000000004f4 ffff88022682e100 ffff880224e07898
[ 2.828177] Call Trace:
[ 2.828422] [<ffffffff817263a4>] dump_stack+0x4c/0x6e
[ 2.828937] [<ffffffff810ccf2a>] ? console_unlock+0x1ca/0x510
[ 2.829514] [<ffffffff810b6827>] lockdep_rcu_suspicious+0xe7/0x120
[ 2.830139] [<ffffffff8108cf05>] ___might_sleep+0x1d5/0x1f0
[ 2.830699] [<ffffffff8108cf6d>] __might_sleep+0x4d/0x90
[ 2.831239] [<ffffffff811f3789>] ? create_object+0x39/0x2e0
[ 2.831800] [<ffffffff811da427>] kmem_cache_alloc+0x47/0x250
[ 2.832375] [<ffffffff813c19ae>] ? find_next_zero_bit+0x1e/0x20
[ 2.832973] [<ffffffff811f3789>] create_object+0x39/0x2e0
[ 2.833515] [<ffffffff810b7eb6>] ? mark_held_locks+0x66/0x90
[ 2.834089] [<ffffffff8172efab>] ? _raw_spin_unlock_irqrestore+0x4b/0x60
[ 2.834761] [<ffffffff817193c1>] kmemleak_alloc_percpu+0x61/0xe0
[ 2.835369] [<ffffffff811a26f0>] pcpu_alloc+0x370/0x630
[ 2.835900] [<ffffffff815e8601>] ? dst_ifdown+0x41/0x90
[ 2.836425] [<ffffffff811a29c2>] __alloc_percpu_gfp+0x12/0x20
[ 2.837008] [<ffffffff816ace20>] ip6_dst_alloc.isra.41+0x30/0xa0
[ 2.837610] [<ffffffff816b195d>] addrconf_dst_alloc+0x3d/0xf0
[ 2.838191] [<ffffffff816a36fc>] ipv6_add_addr+0x27c/0x540
[ 2.838743] [<ffffffff816a34e2>] ? ipv6_add_addr+0x62/0x540
[ 2.839307] [<ffffffff816a810b>] inet6_addr_add+0x11b/0x260
[ 2.839872] [<ffffffff816a8873>] inet6_rtm_newaddr+0x343/0x450
[ 2.840457] [<ffffffff810b96cd>] ? __lock_acquire+0x53d/0x1510
[ 2.841048] [<ffffffff815f0d45>] rtnetlink_rcv_msg+0x95/0x240
[ 2.841625] [<ffffffff810b80ed>] ? trace_hardirqs_on+0xd/0x10
[ 2.860830] [<ffffffff815f0c8f>] ? rtnetlink_rcv+0x1f/0x40
[ 2.879948] [<ffffffff815f0cb0>] ? rtnetlink_rcv+0x40/0x40
[ 2.898849] [<ffffffff8161464f>] netlink_rcv_skb+0xaf/0xc0
[ 2.917687] [<ffffffff815f0c9e>] rtnetlink_rcv+0x2e/0x40
[ 2.936468] [<ffffffff8161404c>] netlink_unicast+0x14c/0x1f0
[ 2.955266] [<ffffffff81614410>] netlink_sendmsg+0x320/0x3a0
[ 2.973517] [<ffffffff811ac19d>] ? __might_fault+0x4d/0xa0
[ 2.991353] [<ffffffff815c0358>] sock_sendmsg+0x38/0x50
[ 3.009172] [<ffffffff815c07bf>] SYSC_sendto+0xef/0x170
[ 3.026925] [<ffffffff810b4853>] ? up_write+0x23/0x50
[ 3.045032] [<ffffffff81001044>] ? lockdep_sys_exit_thunk+0x12/0x14
[ 3.063189] [<ffffffff815c18ae>] SyS_sendto+0xe/0x10
[ 3.081227] [<ffffffff8172f817>] entry_SYSCALL_64_fastpath+0x12/0x6f


The above splat is followed by several "BUG: sleeping function called from invalid context at mm/slub.c:1268" messages, but these are probably secondary.
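
For anyone reading the trace, a small sketch that separates the confirmed call chain from the entries marked "?" may help. It assumes the usual x86 convention that "?" marks an address found on the stack that the unwinder could not confirm as part of the chain. The helper name reliable_frames and the regular expression are illustrative only, and the sample is a portion of the trace above, not the whole splat.

```python
import re

# Portion of the call trace quoted above, copied verbatim.
TRACE = """\
[ 2.830139] [<ffffffff8108cf05>] ___might_sleep+0x1d5/0x1f0
[ 2.830699] [<ffffffff8108cf6d>] __might_sleep+0x4d/0x90
[ 2.831239] [<ffffffff811f3789>] ? create_object+0x39/0x2e0
[ 2.831800] [<ffffffff811da427>] kmem_cache_alloc+0x47/0x250
[ 2.832375] [<ffffffff813c19ae>] ? find_next_zero_bit+0x1e/0x20
[ 2.832973] [<ffffffff811f3789>] create_object+0x39/0x2e0
[ 2.833515] [<ffffffff810b7eb6>] ? mark_held_locks+0x66/0x90
[ 2.834089] [<ffffffff8172efab>] ? _raw_spin_unlock_irqrestore+0x4b/0x60
[ 2.834761] [<ffffffff817193c1>] kmemleak_alloc_percpu+0x61/0xe0
[ 2.835369] [<ffffffff811a26f0>] pcpu_alloc+0x370/0x630
[ 2.835900] [<ffffffff815e8601>] ? dst_ifdown+0x41/0x90
[ 2.836425] [<ffffffff811a29c2>] __alloc_percpu_gfp+0x12/0x20
[ 2.837008] [<ffffffff816ace20>] ip6_dst_alloc.isra.41+0x30/0xa0
[ 2.837610] [<ffffffff816b195d>] addrconf_dst_alloc+0x3d/0xf0
[ 2.838191] [<ffffffff816a36fc>] ipv6_add_addr+0x27c/0x540
"""

# [<address>] [? ]symbol+0xoffset/0xsize
# Group 1 is set only for "?" entries; group 2 is the symbol name.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_frames(text):
    """Return symbol names of frames not marked '?', innermost first."""
    frames = []
    for line in text.splitlines():
        match = FRAME.search(line)
        if match and match.group(1) is None:
            frames.append(match.group(2))
    return frames


if __name__ == "__main__":
    # Print outermost caller first so the chain reads top-down.
    print(" -> ".join(reversed(reliable_frames(TRACE))))
```

On this excerpt the chain runs from ipv6_add_addr, where the lock list shows rcu_read_lock_bh and addrconf_hash_lock being taken, down through the percpu allocation to kmem_cache_alloc and the __might_sleep check that produced the warning.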

The most recent commit was d9b5ec5b1b4d4055e256674de4a5337f6a66d284.

This problem has been bisected to the following:

commit d52d3997f843ffefaa8d8462790ffcaca6c74192
Author: Martin KaFai Lau <kafai@xxxxxx>
Date: Fri May 22 20:56:06 2015 -0700

    ipv6: Create percpu rt6_info

    After the patch
    'ipv6: Only create RTF_CACHE routes after encountering pmtu exception',
    we need to compensate the performance hit (bouncing dst->__refcnt).

    Signed-off-by: Martin KaFai Lau <kafai@xxxxxx>
    Cc: Hannes Frederic Sowa <hannes@xxxxxxxxxxxxxxxxxxx>
    Cc: Steffen Klassert <steffen.klassert@xxxxxxxxxxx>
    Cc: Julian Anastasov <ja@xxxxxx>
    Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>

I will be happy to test any suggested patches.

Larry
