[BUG, PATCH] random: sleep in irq context

From: Sebastian Ott
Date: Mon Apr 23 2018 - 09:40:59 EST


Hi,

Today I observed the following lockdep splat:

irq event stamp: 36

================================
hardirqs last enabled at (35): [<00000000008e6bc6>] _raw_spin_unlock_irqrestore+0x7e/0xb0
hardirqs last disabled at (36): [<00000000008e6884>] _raw_spin_lock_irqsave+0x3c/0xb8
softirqs last enabled at (0): [<00000000001431da>] copy_process.part.8+0x42a/0x1e10
softirqs last disabled at (0): [<0000000000000000>] (null)
WARNING: inconsistent lock state
4.17.0-rc2-00001-gc50517a06b5f #186 Not tainted
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
swapper/4/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
(ptrval) (fs_reclaim){?.+.}, at: fs_reclaim_acquire.part.13+0x0/0x60
{HARDIRQ-ON-W} state was registered at:
__lock_acquire+0x3da/0x13f0
lock_acquire+0xf6/0x290
fs_reclaim_acquire.part.13+0x52/0x60
kmem_cache_alloc_node_trace+0x52/0x408
alloc_worker+0x3e/0x88
init_rescuer.part.7+0x32/0xc8
workqueue_init+0x26a/0x388
kernel_init_freeable+0x11a/0x5a0
kernel_init+0x2a/0x158
kernel_thread_starter+0x6/0xc
kernel_thread_starter+0x0/0xc
irq event stamp: 1780
hardirqs last enabled at (1777): [<000000000010386a>] enabled_wait+0x52/0x138
hardirqs last disabled at (1778): [<00000000008e8336>] ext_int_handler+0x122/0x140
softirqs last enabled at (1780): [<000000000014f424>] irq_enter+0x8c/0x98
softirqs last disabled at (1779): [<000000000014f408>] irq_enter+0x70/0x98

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(fs_reclaim);
<Interrupt>
lock(fs_reclaim);

*** DEADLOCK ***

no locks held by swapper/4/0.

stack backtrace:
CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.17.0-rc2-00001-gc50517a06b5f #186
Hardware name: IBM 3906 M04 704 (LPAR)
Call Trace:
([<0000000000114bbc>] show_stack+0x9c/0x108)
[<00000000008c52ca>] dump_stack+0x9a/0xd8
[<00000000001b9a64>] print_usage_bug+0x25c/0x300
[<00000000001ba1aa>] mark_lock+0x6a2/0x6f8
[<00000000001bb49c>] __lock_acquire+0xc64/0x13f0
[<00000000001bc49e>] lock_acquire+0xf6/0x290
[<00000000002edc82>] fs_reclaim_acquire.part.13+0x52/0x60
[<0000000000375e42>] __kmalloc+0x62/0x438
[<00000000006283b2>] crng_reseed+0x142/0x368
[<0000000000628998>] credit_entropy_bits+0x3c0/0x3f8
[<000000000062959c>] add_interrupt_randomness+0x204/0x298
[<00000000001d44a4>] handle_irq_event_percpu+0x54/0x88
[<00000000001d8e54>] handle_percpu_irq+0x84/0xb0
[<00000000001d2ffa>] generic_handle_irq+0x42/0x60
[<000000000010d3c4>] do_IRQ+0xa4/0xf0
[<00000000008e8350>] ext_int_handler+0x13c/0x140
[<0000000000103882>] enabled_wait+0x6a/0x138
([<000000000010386a>] enabled_wait+0x52/0x138)
[<0000000000103d4a>] arch_cpu_idle+0x32/0x50
[<00000000008e659e>] default_idle_call+0x3e/0x68
[<00000000001889aa>] do_idle+0xea/0x1c0
[<0000000000188d26>] cpu_startup_entry+0x3e/0x48
[<000000000011758e>] smp_start_secondary+0x13e/0x158
[<00000000008e87a2>] restart_int_handler+0x5a/0x70
[<0000000000000000>] (null)
INFO: lockdep is turned off.
BUG: sleeping function called from invalid context at mm/slab.h:421
in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/4
INFO: lockdep is turned off.
irq event stamp: 1780
hardirqs last enabled at (1777): [<000000000010386a>] enabled_wait+0x52/0x138
hardirqs last disabled at (1778): [<00000000008e8336>] ext_int_handler+0x122/0x140
softirqs last enabled at (1780): [<000000000014f424>] irq_enter+0x8c/0x98
softirqs last disabled at (1779): [<000000000014f408>] irq_enter+0x70/0x98
Preemption disabled at:
[<00000000001174f6>] smp_start_secondary+0xa6/0x158
CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.17.0-rc2-00001-gc50517a06b5f #186
Hardware name: IBM 3906 M04 704 (LPAR)
Call Trace:
([<0000000000114bbc>] show_stack+0x9c/0x108)
[<00000000008c52ca>] dump_stack+0x9a/0xd8
[<000000000017d256>] ___might_sleep+0x216/0x278
[<000000000037607c>] __kmalloc+0x29c/0x438
[<00000000006283b2>] crng_reseed+0x142/0x368
[<0000000000628998>] credit_entropy_bits+0x3c0/0x3f8
[<000000000062959c>] add_interrupt_randomness+0x204/0x298
[<00000000001d44a4>] handle_irq_event_percpu+0x54/0x88
[<00000000001d8e54>] handle_percpu_irq+0x84/0xb0
[<00000000001d2ffa>] generic_handle_irq+0x42/0x60
[<000000000010d3c4>] do_IRQ+0xa4/0xf0
[<00000000008e8350>] ext_int_handler+0x13c/0x140
[<0000000000103882>] enabled_wait+0x6a/0x138
([<000000000010386a>] enabled_wait+0x52/0x138)
[<0000000000103d4a>] arch_cpu_idle+0x32/0x50
[<00000000008e659e>] default_idle_call+0x3e/0x68
[<00000000001889aa>] do_idle+0xea/0x1c0
[<0000000000188d26>] cpu_startup_entry+0x3e/0x48
[<000000000011758e>] smp_start_secondary+0x13e/0x158
[<00000000008e87a2>] restart_int_handler+0x5a/0x70
[<0000000000000000>] (null)
INFO: lockdep is turned off.

This happend once during boot and I could not reproduce this since, but I
think the following patch should fix the issue:

----->8