Re: Sleeping BUG in khugepaged for i586

From: Michal Hocko
Date: Mon Jun 12 2017 - 02:29:27 EST


On Sun 11-06-17 16:28:11, David Rientjes wrote:
> On Sat, 10 Jun 2017, Michal Hocko wrote:
>
> > > > I would just pull the cond_resched out of __collapse_huge_page_copy
> > > > right after pte_unmap. But I am not really sure why this cond_resched is
> > > > really needed because the changelog of the patch which adds it is quite
> > > > terse on details.
> > >
> > > I'm not sure what could possibly be added to the changelog. We have
> > > encountered need_resched warnings during the iteration.
> >
> > Well, the part the changelog is not really clear about is whether the
> > HPAGE_PMD_NR loop itself is the source of the stall. This would be
> > quite surprising because doing 512 iterations taking up to 20+s sounds
> > way too much.
>
> I have no idea where you come up with 20+ seconds.

OK, I misread your report as a soft lockup.

> These are not soft lockups, these are need_resched warnings. We monitor
> how long need_resched has been set and warn when a thread takes an excessive
> amount of time to reschedule after it has been set. A loop of 512 pages
> with ptl contention and doing {clear,copy}_user_highpage() shows that
> need_resched can sit without scheduling for an excessive amount of time.

How much is excessive here?
--
Michal Hocko
SUSE Labs