Re: [PATCH 2/4] watchdog/softlockup: Replace "watchdog/%u" threads with cpu_stop_work

From: Peter Zijlstra
Date: Tue Jun 12 2018 - 08:17:42 EST



On Fri, Jun 08, 2018 at 03:57:04PM +0200, Oleg Nesterov wrote:

> And probably there is another problem. Both watchdog_disable(cpu) and
> watchdog_nmi_disable(cpu) assume that cpu == smp_processor_id(), this arg
> is simply ignored.
>
> but lockup_detector_offline_cpu(cpu) is called by cpuhp_invoke_callback(),
> so in this case watchdog_disable(dying_cpu) is simply wrong.

But at this point, the cpuhp_invoke_callback() is ran from the dying CPU
still, so dying_cpu == this_cpu. I actually have a WARN in both
watchdog_{dis,en}able() to verify this assumption.

> May be we can do something like below? Then softlockup_stop_all() can simply do
>
> for_each_cpu(cpu, &watchdog_allowed_mask)
> watchdog_disable(cpu);
>
> watchdog_nmi_disable() is __weak, but at first glance arch/sparc/kernel/nmi.c
> does everything correctly.

I prefer to not do that and keep the current assumption. While it would
work for the disable, it the above form will not work for enable (we
really must start hrtimers on the right CPU) and that would bring some
asymmetry.