Re: [PATCH] mm, vmstat: Allow WQ concurrency to discover memory reclaim doesn't make any progress

From: Tetsuo Handa
Date: Wed Nov 25 2015 - 06:54:20 EST


Michal Hocko wrote:
> Anyway I think that the issue is not solely theoretical. WQ_MEM_RECLAIM
> is simply not working if the allocation path doesn't sleep currently and
> my understanding of what Tejun claims [2] is that reimplementing WQ
> concurrency would be too intrusive and lacks sufficient justification
> because other kernel paths do sleep. This patch tries to reduce the
> sleep only to worker threads which should not cause any problems to
> regular tasks.

I received many unexplained hang/reboot reports from customers when I was
working at a support center. But we can't tell whether real people have ever
hit this problem because we have no watchdog for memory allocation stalls.
I want one like http://lkml.kernel.org/r/201511250024.AAE78692.QVOtFFOSFOMLJH@xxxxxxxxxxxxxxxxxxx
as I wrote off-list ("mm,oom: The reason why I continue proposing timeout
based approach."). It will help us judge this when we tackle the TIF_MEMDIE
livelock problem.

What I can say is that RHEL6 (a 2.6.32-based distro) backported the
wait_iff_congested() changes and therefore people might really hit
this problem.