Re: [ghes_copy_tofrom_phys] BUG: sleeping function called from invalid context at mm/page_alloc.c:4150

From: Will Deacon
Date: Tue Oct 31 2017 - 06:38:39 EST


On Mon, Oct 30, 2017 at 04:14:15PM -0400, Tyler Baicar wrote:
> On 10/30/2017 1:46 PM, Linus Torvalds wrote:
> >On Mon, Oct 30, 2017 at 10:20 AM, Linus Torvalds
> ><torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >>I will add a "might_sleep()" to ioremap_page_range() itself, so that
> >>we get this warning more reliably and much earlier. Right now it has
> >>been hidden by the fact that most of the time the page tables
> >>may be already allocated, but even then it's broken.
> >Done. It doesn't report anything for me, so _hopefully_ the GHES
> >driver is the only one that does games like this. See commit
> >b39ab98e2f47 ("Mark 'ioremap_page_range()' as possibly sleeping").
> >
> >So now it should hopefully warn about this bad usage of page remapping
> >reliably, at least if you have CONFIG_DEBUG_ATOMIC_SLEEP enabled.
> >
> >Can somebody who has a working GHES setup (although Borislav seems to
> >think no such thing exists) verify?
> Hello Linus,
>
> I have verified that this flags the error for me every time ghes_proc() is used.
> But I also see it flagged in ARM PMU code:
>
> [    7.381153] BUG: sleeping function called from invalid context at mm/slab.h:420
> [    7.387625] in_atomic(): 0, irqs_disabled(): 128, pid: 11, name: cpuhp/0
> [    7.394310] CPU: 0 PID: 11 Comm: cpuhp/0 Not tainted 4.14.0-rc7 #46
> [    7.400559] Hardware name: Qualcomm Qualcomm Centriq(TM) 2400 Development Platform
> [    7.414361] Call trace:
> [    7.416797] [<ffff000008088b28>] dump_backtrace+0x0/0x270
> [    7.422175] [<ffff000008088dbc>] show_stack+0x24/0x30
> [    7.427211] [<ffff0000090d01f0>] dump_stack+0x98/0xb8
> [    7.432246] [<ffff00000810118c>] ___might_sleep+0x104/0x128
> [    7.437799] [<ffff000008101208>] __might_sleep+0x58/0x90
> [    7.443097] [<ffff000008254a7c>] kmem_cache_alloc_trace+0x224/0x280
> [    7.449347] [<ffff000008e9c938>] armpmu_alloc+0x30/0x168
> [    7.454639] [<ffff000008e9d15c>] arm_pmu_acpi_cpu_starting+0x114/0x148
> [    7.461151] [<ffff0000080d0f30>] cpuhp_invoke_callback+0xb8/0x760
> [    7.467226] [<ffff0000080d1ec4>] cpuhp_thread_fun+0xa4/0x1b8
> [    7.472872] [<ffff0000080f661c>] smpboot_thread_fn+0x174/0x250
> [    7.478684] [<ffff0000080f18ec>] kthread+0x114/0x140
> [    7.483632] [<ffff000008084774>] ret_from_fork+0x10/0x1c

I know Mark was doing some fixes in the ACPI notifier code here, so I've
added him to CC.

Will