Re: BUG: scheduling while atomic: cron/668/0x10c9a0c0 (was: Re: mm, page_alloc: avoid looking up the first zone in a zonelist twice)

From: Mel Gorman
Date: Tue May 31 2016 - 06:14:06 EST


On Tue, May 31, 2016 at 11:28:05AM +0200, Geert Uytterhoeven wrote:
> Hi Mel,
>
> On Mon, May 30, 2016 at 8:56 PM, Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
> > Thanks. Please try the following instead
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index bb320cde4d6d..557549c81083 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -3024,6 +3024,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
> >  		apply_fair = false;
> >  		fair_skipped = false;
> >  		reset_alloc_batches(ac->preferred_zoneref->zone);
> > +		z = ac->preferred_zoneref;
> >  		goto zonelist_scan;
> >  	}
>
> Thanks a lot, that seems to fix the issue!
>
> Tested-by: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>
>
> JFTR, without the fix, sometimes I get a different, but equally obscure, crash
> than the one I posted before:
>

I'm afraid I don't recognise it. Given the nature of the previous bug
though, I have a vague suspicion that someone is not handling a page
allocation failure properly and goes boom later.
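
To make that suspicion concrete, here is a minimal userspace sketch, not
kernel code. It assumes the earlier bug amounted to the second scan
starting from a stale cursor, so an allocation could fail even though the
preferred zone still had free pages. The names toy_zone, toy_alloc and
toy_caller are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a zone; not the kernel's struct zone. */
struct toy_zone {
	int free_pages;	/* pages still available */
	int batch;	/* remaining fair-share budget for the first pass */
};

/*
 * Two-pass scan. The first pass skips zones whose batch is used up,
 * the second pass ignores the batch. Returns the index of the zone
 * that satisfied the request, or -1 on failure.
 *
 * reset_cursor == false models the behaviour without the one-line fix:
 * the second pass continues from wherever the first pass stopped.
 */
static int toy_alloc(struct toy_zone *zones, size_t nr, size_t preferred,
		     bool reset_cursor)
{
	bool apply_fair = true;
	size_t z = preferred;

	assert(preferred < nr);
rescan:
	for (; z < nr; z++) {
		if (zones[z].free_pages <= 0)
			continue;
		if (apply_fair && zones[z].batch <= 0)
			continue;
		zones[z].free_pages--;
		if (apply_fair)
			zones[z].batch--;
		return (int)z;
	}

	if (apply_fair) {
		apply_fair = false;
		if (reset_cursor)
			z = preferred;	/* analogue of z = ac->preferred_zoneref */
		goto rescan;
	}
	return -1;
}

/* A caller that checks for failure instead of assuming success. */
static int toy_caller(struct toy_zone *zones, size_t nr, bool reset_cursor)
{
	int zone = toy_alloc(zones, nr, 0, reset_cursor);

	if (zone < 0)
		return -ENOMEM;	/* propagate the error, do not use the result */
	return 0;
}
```

With a single zone that has free pages but an exhausted batch, the call
without the reset fails and the call with the reset succeeds. A caller
that skipped the check in toy_caller would carry on with an invalid
result and only misbehave later, which is the kind of pattern I have in
mind.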

--
Mel Gorman
SUSE Labs