Re: [ghes_copy_tofrom_phys] BUG: sleeping function called from invalid context at mm/page_alloc.c:4150

From: Will Deacon
Date: Mon Oct 30 2017 - 13:49:55 EST


On Mon, Oct 30, 2017 at 10:46:31AM -0700, Linus Torvalds wrote:
> On Mon, Oct 30, 2017 at 10:20 AM, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > I will add a "might_sleep()" to ioremap_page_range() itself, so that
> > we get this warning more reliably and much earlier. Right now it has
> > been hidden by the fact that most of the time the page tables
> > may be already allocated, but even then it's broken.
>
> Done. It doesn't report anything for me, so _hopefully_ the GHES
> driver is the only one that does games like this. See commit
> b39ab98e2f47 ("Mark 'ioremap_page_range()' as possibly sleeping").
>
> So now it should hopefully warn about this bad usage of page remapping
> reliably, at least if you have CONFIG_DEBUG_ATOMIC_SLEEP enabled.
>
> Can somebody who has a working GHES setup (although Borislav seems to
> think no such thing exists) verify?
>
> This obviously won't _fix_ anything, but at least it should make it
> clear it's not that recent change that broke things - that just
> happened to expose it. And hopefully somebody who knows that driver
> will do the proper fixmap thing (or just ioremap once at probe time,
> rather than at run-time).

FWIW, we discussed some of this back in 2015, because the TLB invalidation
looks busted to me too:

https://marc.info/?l=linux-kernel&m=145009681808308&w=2

Didn't go anywhere though...

Will