Re: [RFC 1/3] oom, sysrq: Skip over oom victims and killed tasks

From: David Rientjes
Date: Tue Jan 19 2016 - 17:57:47 EST


On Fri, 15 Jan 2016, Michal Hocko wrote:

> > I think it's time to kill sysrq+F and I'll send those two patches
> > unless there is a usecase I'm not aware of.
>
> I have described one in the part you haven't quoted here. Let me repeat:
> : Your system might be thrashing to the point you are not able to log in
> : and resolve the situation in a reasonable time yet you are still not
> : OOM. sysrq+f is your only choice then.
>
> Could you clarify why it is better to ditch a potentially useful
> emergency tool rather than to make it work reliably and predictably?

I'm concerned about your usecase where the kernel requires admin
intervention to resolve such an issue and there is nothing in the VM we
can do to fix it.

If you have a specific test that demonstrates when your usecase is needed,
please provide it so we can address the issue that it triggers. I'd
prefer to fix the issue in the VM rather than require human intervention,
especially when we try to keep a very large number of machines running in
our datacenters.