Re: Sleeping BUG in khugepaged for i586

From: Larry Finger
Date: Thu Jun 08 2017 - 11:29:50 EST


On 06/07/2017 03:56 PM, David Rientjes wrote:
On Wed, 7 Jun 2017, Vlastimil Babka wrote:

Hmm I'd expect such spin lock to be reported together with mmap_sem in
the debugging "locks held" message?

My bisection of the problem is about half done. My latest good version is commit
7b8cd33 and the latest bad one is 2ea659a. Only about 7 steps to go.

Hmm, your bisection will most likely just find commit 338a16ba15495
which added the cond_resched() at mm/khugepaged.c:655. CCing David who
added it.


I agree it's probably going to bisect to 338a16ba15495 since it's the
cond_resched() at the line number reported, but I think there must be
something else going on. I think the list of locks held by khugepaged is
correct because it matches with the implementation. The preempt_count(),
as suggested by Andrew, does not. If this is reproducible, I'd like to
know what preempt_count() is.


The BUG output is reproducible. By the time the box finishes booting, at least two instances have been logged. My bisection shows that commit 338a16ba15495 is the first bad commit. I added a pr_info() to output the value of preempt_count() just before the cond_resched() statement. The count was always 1, whether or not the BUG was triggered.
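
A debug print of that kind could look roughly like the sketch below. This is a user-space mock for illustration only, not the patch that was actually tested: preempt_count(), cond_resched() and pr_info() are stubbed here (in the kernel they are the real primitives), the value 1 is simply the count reported above, and log_then_resched() is a hypothetical helper name. In the real tree the two statements would sit inline at the cond_resched() call site in mm/khugepaged.c.

```c
#include <stdio.h>

/* Stand-ins for the kernel primitives, so the sketch builds in user space. */
static int mock_preempt_count = 1;  /* assumption: the value reported above */
static int resched_calls;           /* counts calls to the stubbed cond_resched() */

static int preempt_count(void) { return mock_preempt_count; }
static void cond_resched(void) { resched_calls++; }
#define pr_info(...) printf(__VA_ARGS__)

/*
 * Hypothetical helper: log the preemption count immediately before
 * the rescheduling point, then return the value that was logged.
 */
static int log_then_resched(void)
{
	int count = preempt_count();

	pr_info("khugepaged: preempt_count() = %d before cond_resched()\n",
		count);
	cond_resched();
	return count;
}
```

Printing the value immediately before the call keeps the log line adjacent to any BUG splat, so the two can be matched up in dmesg.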

If there are other things you would like logged at that point, or any other diagnostics, please let me know.

Larry