Re: 5.0-rc kernel hangs on early boot

From: Yury Norov
Date: Wed Feb 13 2019 - 06:51:22 EST


On Wed, Feb 13, 2019 at 11:14:09AM +0000, Mel Gorman wrote:
> On Wed, Feb 13, 2019 at 11:25:40AM +0300, Yury Norov wrote:
> > Hi Mel, all,
> >
> > My kernel on a qemu/arm64 setup has been hanging in early boot since
> > v5.0-rc1. The backtrace is not very informative:
> > (gdb) i threads
> > Id Target Id Frame
> > * 1 Thread 1 (CPU#0 [running]) 0xffff000010a49b74 in __delay (cycles=4096)
> > at arch/arm64/lib/delay.c:49
> > 2 Thread 2 (CPU#1 [halted ]) 0x0000000000000000 in ?? ()
> > 3 Thread 3 (CPU#2 [halted ]) 0x0000000000000000 in ?? ()
> > 4 Thread 4 (CPU#3 [halted ]) 0x0000000000000000 in ?? ()
> > (gdb) bt
> > #0 0xffff000010a49b74 in __delay (cycles=4096) at arch/arm64/lib/delay.c:49
> > Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> >
> > Reverting the patch
> > 1c30844d2dfe272d58c ("mm: reclaim small amounts of memory when an external
> > fragmentation event occurs") together with the following patch
> > 73444bc4d8f92e46a20 ("mm, page_alloc: do not wake kswapd with zone lock held")
> > helps me to boot normally.
> >
>
> Well, that's a bad start to any day. Thanks for tracking it down. Does
> the following patch help? I can't test it properly as I didn't recreate
> your boot image or initrd but this appears to get past the initial boot
> phase at least.

Hi Mel,

The patch works for me. The day gets better indeed. :-)

Tested-by: Yury Norov <yury.norov@xxxxxxxxx>

Yury

> ---8<---
> mm, page_alloc: Fix a division by zero error when boosting watermarks
>
> Yury Norov reported that an arm64 KVM instance could not boot since
> v5.0-rc1 and that this could be addressed by reverting the patches
>
> 1c30844d2dfe272d58c ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
> 73444bc4d8f92e46a20 ("mm, page_alloc: do not wake kswapd with zone lock held")
>
> The problem is that a division by zero error is possible if boosting occurs
> either very early in boot or if the high watermark is very small. This
> patch checks for the conditions and avoids boosting in those cases.
>
> Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
> Reported-by: Yury Norov <yury.norov@xxxxxxxxx>
> Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
>
> ---
> mm/page_alloc.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d295c9bc01a8..ae7e4ba5b9f5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2170,6 +2170,11 @@ static inline void boost_watermark(struct zone *zone)
>
>          max_boost = mult_frac(zone->_watermark[WMARK_HIGH],
>                                watermark_boost_factor, 10000);
> +
> +        /* high watermark may be uninitialised or very small */
> +        if (!max_boost)
> +                return;
> +
>          max_boost = max(pageblock_nr_pages, max_boost);
>
>          zone->watermark_boost = min(zone->watermark_boost + pageblock_nr_pages,
>
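
For readers following the thread, the guard in the hunk above can be tried
out in user space. This is only a minimal sketch under stated assumptions,
not the kernel code: struct zone_sketch, mult_frac_ul() and
boost_watermark_sketch() are hypothetical stand-ins, the values of
pageblock_nr_pages (512) and watermark_boost_factor (15000) are assumed for
illustration, and the second argument of min(), which is cut off in the
quoted hunk, is assumed to be max_boost.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's mult_frac(): computes
 * x * numer / denom while keeping the intermediate product small. */
static unsigned long mult_frac_ul(unsigned long x, unsigned long numer,
                                  unsigned long denom)
{
    assert(denom != 0);
    unsigned long quot = x / denom;
    unsigned long rem = x % denom;

    return quot * numer + (rem * numer) / denom;
}

/* Hypothetical, cut-down zone: only the two fields the sketch needs. */
struct zone_sketch {
    unsigned long wmark_high;       /* models zone->_watermark[WMARK_HIGH] */
    unsigned long watermark_boost;  /* models zone->watermark_boost */
};

/* Assumed values, chosen for illustration only. */
static const unsigned long pageblock_nr_pages = 512;
static const unsigned long watermark_boost_factor = 15000;

static void boost_watermark_sketch(struct zone_sketch *zone)
{
    unsigned long max_boost = mult_frac_ul(zone->wmark_high,
                                           watermark_boost_factor, 10000);

    /* The guard from the patch: a zero result means the high watermark
     * is uninitialised or very small, so do not boost at all. */
    if (!max_boost)
        return;

    /* max(pageblock_nr_pages, max_boost) */
    if (max_boost < pageblock_nr_pages)
        max_boost = pageblock_nr_pages;

    /* min(watermark_boost + pageblock_nr_pages, max_boost); the second
     * argument is an assumption, the quoted hunk is truncated here. */
    unsigned long boosted = zone->watermark_boost + pageblock_nr_pages;

    zone->watermark_boost = boosted < max_boost ? boosted : max_boost;
}
```

With wmark_high set to 0 the function returns early and watermark_boost
stays at 0. With wmark_high set to 1000, max_boost works out to 1500 under
the assumed factor, so three successive calls leave watermark_boost at 512,
1024 and then 1500.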