Re: [RFC PATCH] watchdog: Adding softwatchdog

From: Tetsuo Handa
Date: Sat Apr 24 2021 - 11:28:25 EST


On 2021/04/24 23:41, Guenter Roeck wrote:
> On 4/24/21 3:25 AM, Peter Enderborg wrote:
>> This is not a rebooting watchdog. Its function is to take actions
>> other than a hard reboot. On many complex systems there is some
>> kind of manager that monitors and takes action on slow systems.
>> Android has its lowmemorykiller (lmkd), desktops have earlyoom.
>> This watchdog can be used to help the monitor perform some basic
>> action to keep the monitor running.
>>
>> It can also be used standalone. This adds a policy that kills
>> the process with the highest oom_score_adj, using oom functions
>> to do it quickly. I think it is a good use case for the patch.
>> Memory situations can be problematic for software that monitors
>> the system, but other policies should also be possible, like
>> picking tasks from a memcg, or specific UIDs, or whatever is
>> low priority.
>> ---
>
> NACK. Besides this not following the new watchdog API, the task
> of a watchdog is to reset the system on failure. Its task is most
> definitely not to re-implement the oom killer in any way, shape,
> or form.
>

I don't think this proposal is a watchdog. I think this proposal is
a timer-based process killer, built on the assumption that any slowdown
which prevents the monitor process from pinging for more than 0.5 seconds
(if HZ == 1000) is caused by memory pressure.