linux 4.2.4 rcu_sched rolls over and barfs after debugger exits

From: Jeffrey Merkey
Date: Mon Oct 26 2015 - 01:58:21 EST


After using the mdb kernel debugger and then exiting, rcu_sched, which
runs off its own internal timers, rolls over and crashes when it does
not get the timeout window it expects. This is not caused by memory
corruption. The debugger holds the system suspended, and once the
system is allowed to run again, rcu_sched rolls over and dies.
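
To illustrate the mechanism described above, here is a toy model in
Python, not kernel code. It assumes the stall check compares the
current jiffies value against a deadline set when a grace period
starts, and that jiffies jumps forward by the held time once the
debugger releases the system. The class and function names, HZ = 1000,
and the 21 second timeout are all assumptions made for this sketch.

```python
# Toy model of a jiffies-based stall check. Every name here is
# hypothetical; this is not the kernel's implementation.

HZ = 1000                # assumed tick rate (not stated in the report)
STALL_TIMEOUT = 21 * HZ  # assumed stall timeout of 21 s, in jiffies


class StallDetector:
    def __init__(self, now):
        self.gp_in_progress = False
        self.deadline = now + STALL_TIMEOUT

    def start_grace_period(self, now):
        # A new grace period arms the deadline.
        self.gp_in_progress = True
        self.deadline = now + STALL_TIMEOUT

    def check(self, now):
        # Called from the timer tick; True means "report a stall".
        return self.gp_in_progress and now >= self.deadline

    def reset(self, now):
        # Hook a debugger could call on exit to forgive the held time.
        self.deadline = now + STALL_TIMEOUT


def simulate(held_jiffies, reset_on_exit):
    jiffies = 0
    det = StallDetector(jiffies)
    det.start_grace_period(jiffies)
    # Debugger holds the system: no ticks run, then the clock catches
    # up in one step when the system is released.
    jiffies += held_jiffies
    if reset_on_exit:
        det.reset(jiffies)
    jiffies += 1  # first tick after the debugger exits
    return det.check(jiffies)


if __name__ == "__main__":
    print(simulate(41279, reset_on_exit=False))  # True: stall reported
    print(simulate(41279, reset_on_exit=True))   # False: deadline moved
```

In this model the report fires on the first tick after the debugger
exits unless the deadline is pushed out first. Mainline has
rcu_cpu_stall_reset() for that purpose. The model does not show
whether calling it would be sufficient for the case reported here.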

There are several things happening here -- lots of bugs, Linus ...

Jeff

sysrq: SysRq : MDB
INFO: rcu_sched detected stalls on CPUs/tasks:
(detected by 0, t=41279 jiffies, g=14721, c=14720, q=5)
All QSes seen, last rcu_sched kthread activity 41279
(-165477--206756), jiffies_till_next_fqs=3, root ->qsmask 0x0
NetworkManager R running 0 1703 1 0x00000080
c0bb6a28 c046d763 c0a895d9 00000000 000006a7 00000001 00000080 f64c1140
c0b535c0 00003981 c04a5126 c0a823a8 c0b53a91 0000a13f fffd799b fffcd85c
00000003 00000000 00000096 00000000 00003981 3b9aca00 00003981 00003980
Call Trace:
[<c046d763>] ? sched_show_task+0xb3/0x120
[<c04a5126>] ? print_other_cpu_stall+0x276/0x2c0
[<c04a52e0>] ? __rcu_pending+0x170/0x210
[<c04a632f>] ? rcu_check_callbacks+0xbf/0x1a0
[<c04a8f48>] ? update_process_times+0x28/0x50
[<c04ba943>] ? tick_sched_handle+0x33/0x70
[<c04baa97>] ? tick_sched_timer+0x47/0xa0
[<c04aaefa>] ? __remove_hrtimer+0x4a/0x90
[<c04ab656>] ? __run_hrtimer+0x66/0x180
[<c04baa50>] ? tick_nohz_handler+0xd0/0xd0
[<c055f5e5>] ? __vfs_read+0xc5/0xf0
[<c04ab7f8>] ? __hrtimer_run_queues+0x88/0xc0
[<c04ab995>] ? hrtimer_interrupt+0x85/0x170
[<c0436746>] ? local_apic_timer_interrupt+0x26/0x50
[<c0451655>] ? irq_enter+0x5/0x50
[<c043679b>] ? smp_apic_timer_interrupt+0x2b/0x50
[<c090468d>] ? apic_timer_interrupt+0x2d/0x34
[<c0900000>] ? firmware_map_add_hotplug+0x45/0x141
rcu_sched kthread starved for 41279 jiffies! g14721 c14720 f0x2
fuse init (API version 7.23)
blk_update_request: I/O error, dev fd0, sector 0
floppy: error -5 while reading block 0
blk_update_request: I/O error, dev fd0, sector 0
floppy: error -5 while reading block 0
sysrq: SysRq : MDB
INFO: rcu_sched detected stalls on CPUs/tasks:
(detected by 0, t=21939 jiffies, g=17972, c=17971, q=3)
All QSes seen, last rcu_sched kthread activity 21939
(-124010--145949), jiffies_till_next_fqs=3, root ->qsmask 0x0
rtkit-daemon R running 0 2878 1 0x00000080
c0bb6a28 c046d763 c0a895d9 00000000 00000b3e 00000001 00000080 f64c1140
c0b535c0 00004634 c04a5126 c0a823a8 c0b53a91 000055b3 fffe1b96 fffdc5e3
00000003 00000000 00000086 00000000 00004634 f69ec5cc 00004634 00004633
Call Trace:
[<c046d763>] ? sched_show_task+0xb3/0x120
[<c04a5126>] ? print_other_cpu_stall+0x276/0x2c0
[<c04a52e0>] ? __rcu_pending+0x170/0x210
[<c04a632f>] ? rcu_check_callbacks+0xbf/0x1a0
[<c04a8f48>] ? update_process_times+0x28/0x50
[<c04ba943>] ? tick_sched_handle+0x33/0x70
[<c04baa97>] ? tick_sched_timer+0x47/0xa0
[<c04aaefa>] ? __remove_hrtimer+0x4a/0x90
[<c04ab656>] ? __run_hrtimer+0x66/0x180
[<c04baa50>] ? tick_nohz_handler+0xd0/0xd0
[<c083a719>] ? __kmalloc_reserve+0x29/0x80
[<c04ab7f8>] ? __hrtimer_run_queues+0x88/0xc0
[<c04ab995>] ? hrtimer_interrupt+0x85/0x170
[<c0486507>] ? __wake_up_common+0x47/0x70
[<c0436746>] ? local_apic_timer_interrupt+0x26/0x50
[<c0451655>] ? irq_enter+0x5/0x50
[<c043679b>] ? smp_apic_timer_interrupt+0x2b/0x50
[<c090468d>] ? apic_timer_interrupt+0x2d/0x34
[<c05689b0>] ? legitimize_path+0x50/0x50
[<c056b8e5>] ? lookup_fast+0x155/0x2d0
[<c0568fbd>] ? generic_permission+0xcd/0x100
[<c056ba9a>] ? walk_component+0x3a/0x1f0
[<c08334f5>] ? SYSC_sendto+0x125/0x150
[<c056d1a6>] ? path_lookupat+0x56/0xf0
[<c056d48b>] ? filename_lookup+0x8b/0x150
[<f9cd02c2>] ? nl80211_send_bss.clone.4+0xe2/0x490 [cfg80211]
[<c056946e>] ? getname_flags+0x3e/0x1b0
[<c056948d>] ? getname_flags+0x5d/0x1b0
[<c05641fe>] ? vfs_fstatat+0x4e/0xa0
[<c0564308>] ? vfs_stat+0x18/0x20
[<c056464a>] ? SyS_stat64+0x1a/0x40
[<c0834535>] ? SyS_socketcall+0x235/0x300
[<c04da94c>] ? __audit_syscall_entry+0x9c/0x100
[<c0903b48>] ? sysenter_do_call+0x12/0x12
rcu_sched kthread starved for 21939 jiffies! g17972 c17971 f0x2
[root@aya ~]#
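
For reading the two stall reports above, a small Python helper can
pull the fields out of the "(detected by ...)" line and convert the
t= value from jiffies to seconds. The HZ value is an assumption,
since the tick rate is not stated in this report, and the function
name is made up for the sketch. Field names are kept as they appear
in the log.

```python
import re

HZ = 1000  # assumed tick rate; not stated in the report

# Matches lines like:
# (detected by 0, t=41279 jiffies, g=14721, c=14720, q=5)
STALL_RE = re.compile(
    r"\(detected by (\d+), t=(\d+) jiffies, g=(-?\d+), c=(-?\d+), q=(\d+)\)"
)


def parse_stall_line(line, hz=HZ):
    """Return the stall fields as a dict, or None if the line does not match."""
    m = STALL_RE.search(line)
    if m is None:
        return None
    cpu, t, g, c, q = (int(x) for x in m.groups())
    return {
        "detected_by": cpu,
        "t": t,                # jiffies, as printed
        "t_seconds": t / hz,   # only as accurate as the assumed HZ
        "g": g,
        "c": c,
        "q": q,
    }


if __name__ == "__main__":
    for line in (
        "(detected by 0, t=41279 jiffies, g=14721, c=14720, q=5)",
        "(detected by 0, t=21939 jiffies, g=17972, c=17971, q=3)",
    ):
        print(parse_stall_line(line))
```

With HZ assumed to be 1000, the two reports correspond to roughly 41.3
and 21.9 seconds. A different tick rate scales both numbers.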