Re: [PATCH v3] hardlockup: detect hard lockups using secondary (buddy) CPUs

From: Doug Anderson
Date: Thu May 04 2023 - 18:39:25 EST


Hi,

On Mon, May 1, 2023 at 8:25 AM Douglas Anderson <dianders@xxxxxxxxxxxx> wrote:
>
> From: Colin Cross <ccross@xxxxxxxxxxx>
>
> Implement a hardlockup detector that doesn't need any extra
> arch-specific support code to detect lockups. Instead of using
> something arch-specific we will use the buddy system, where each CPU
> watches out for another one. Specifically, each CPU will use its
> softlockup hrtimer to check that the next CPU is processing hrtimer
> interrupts by verifying that a counter is increasing.
>
> NOTE: unlike the other hard lockup detectors, the buddy one can't
> easily show what's happening on the CPU that locked up just by doing a
> simple backtrace. It relies on some other mechanism in the system to
> get information about the locked up CPUs. This could be support for
> NMI backtraces like [1], it could be a mechanism for printing the PC
> of locked CPUs at panic time like [2] / [3], or it could be something
> else. Even though that means we still rely on arch-specific code, this
> arch-specific code seems to often be implemented even on architectures
> that don't have a hardlockup detector.
>
> This style of hardlockup detector originated in some downstream
> Android trees and has been rebased on / carried in ChromeOS trees for
> quite a long time for use on arm and arm64 boards. Historically on
> these boards we've leveraged mechanism [2] / [3] to get information
> about hung CPUs, but we could move to [1].
>
> Although the original motivation for the buddy system was for use on
> systems without an arch-specific hardlockup detector, it can still be
> useful to use even on systems that _do_ have an arch-specific
> hardlockup detector. On x86, for instance, there is a 24-part patch
> series [4] in progress switching the arch-specific hard lockup
> detector from a scarce perf counter to a less-scarce hardware
> resource. Potentially the buddy system could be a simpler alternative
> to free up the perf counter but still get hard lockup detection.
>
> Overall, pros (+) and cons (-) of the buddy system compared to an
> arch-specific hardlockup detector:
> + The buddy system is usable on systems that don't have an
>   arch-specific hardlockup detector, like arm32 and arm64 (though it's
>   being worked on for arm64 [5]).
> + The buddy system may free up scarce hardware resources.
> + If a CPU totally goes out to lunch (can't process NMIs) the buddy
>   system could still detect the problem (though it would be unlikely
>   to be able to get a stack trace).
> + The buddy system uses the same timer function to pet the hardlockup
>   detector on the running CPU as it uses to detect hardlockups on
>   other CPUs. Compared to other hardlockup detectors, this means it
>   generates fewer interrupts and thus is likely better able to let
>   CPUs stay idle longer.
> - If all CPUs are hard locked up at the same time the buddy system
>   can't detect it.
> - If we don't have SMP we can't use the buddy system.
> - The buddy system needs an arch-specific mechanism (possibly NMI
>   backtrace) to get info about the locked up CPU.
>
> [1] https://lore.kernel.org/r/20230419225604.21204-1-dianders@xxxxxxxxxxxx
> [2] https://issuetracker.google.com/172213129
> [3] https://docs.kernel.org/trace/coresight/coresight-cpu-debug.html
> [4] https://lore.kernel.org/lkml/20230301234753.28582-1-ricardo.neri-calderon@xxxxxxxxxxxxxxx/
> [5] https://lore.kernel.org/linux-arm-kernel/20220903093415.15850-1-lecopzer.chen@xxxxxxxxxxxx/
>
> Signed-off-by: Colin Cross <ccross@xxxxxxxxxxx>
> Signed-off-by: Matthias Kaehlcke <mka@xxxxxxxxxxxx>
> Signed-off-by: Guenter Roeck <groeck@xxxxxxxxxxxx>
> Signed-off-by: Tzung-Bi Shih <tzungbi@xxxxxxxxxxxx>
> Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
> ---
> This patch has been rebased in ChromeOS kernel trees many times, and
> each time someone had to do work on it they added their
> Signed-off-by. I've included those here. I've also left the author as
> Colin Cross since the core code is still his.
>
> I'll also note that the CC list is pretty giant, but that's what
> get_maintainer.pl came up with (plus a few other folks I thought would
> be interested). As far as I can tell, there's no true MAINTAINER
> listed for the existing watchdog code. Assuming people don't hate
> this, maybe it would go through Andrew Morton's tree?
>
> Changes in v3:
> - More cpu => CPU (in Kconfig and comments).
> - Added a note in commit message about the effect on idle.
> - Cleaned up commit message pros/cons to be complete sentences.
> - No code changes other than comments.
>
> Changes in v2:
> - cpu => CPU (in commit message).
> - Reworked description and Kconfig based on v1 discussion.
> - No code changes.
>
> include/linux/nmi.h         |  18 ++++-
> kernel/Makefile             |   1 +
> kernel/watchdog.c           |  24 ++++--
> kernel/watchdog_buddy_cpu.c | 141 ++++++++++++++++++++++++++++++++++++
> lib/Kconfig.debug           |  23 +++++-
> 5 files changed, 196 insertions(+), 11 deletions(-)
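
For anyone skimming the thread, the counter check described in the quoted
commit message can be modeled in user space roughly as below. This is only
a sketch under stated assumptions, not the code from
kernel/watchdog_buddy_cpu.c: the names (struct cpu_state, NR_CPUS_SIM,
CHECK_EVERY, timer_tick(), simulate()) are made up for illustration, and
checking the buddy on every third tick is an assumption of the sketch.

```c
#include <stdbool.h>
#include <string.h>

#define NR_CPUS_SIM 4	/* hypothetical CPU count for the simulation */
#define CHECK_EVERY 3	/* assumed: look at the buddy every 3rd tick */

struct cpu_state {
	unsigned long hrtimer_interrupts;	/* bumped by this CPU's own timer */
	unsigned long hrtimer_interrupts_saved;	/* last value its watcher saw */
};

static struct cpu_state cpus[NR_CPUS_SIM];

/* Each CPU watches the next one, wrapping around at the end. */
static int buddy_of(int cpu)
{
	return (cpu + 1) % NR_CPUS_SIM;
}

/*
 * Models one timer interrupt on @cpu: bump our own counter, then
 * occasionally verify that the buddy's counter moved since the last look.
 * Returns true if the buddy appears to be locked up.
 */
static bool timer_tick(int cpu)
{
	int buddy = buddy_of(cpu);
	unsigned long seen;

	cpus[cpu].hrtimer_interrupts++;
	if (cpus[cpu].hrtimer_interrupts % CHECK_EVERY != 0)
		return false;

	seen = cpus[buddy].hrtimer_interrupts;
	if (seen == cpus[buddy].hrtimer_interrupts_saved)
		return true;	/* no timer interrupts since the last check */

	cpus[buddy].hrtimer_interrupts_saved = seen;
	return false;
}

/*
 * Runs @rounds rounds in which every CPU takes one tick, in CPU order.
 * @stalled_cpu stops ticking after @stall_after rounds (-1: nobody stalls).
 * Returns the CPU that reported a lockup and stores the stuck CPU in
 * @victim, or returns -1 if nothing was reported.
 */
static int simulate(int stalled_cpu, unsigned int stall_after,
		    unsigned int rounds, int *victim)
{
	unsigned int round;
	int cpu;

	memset(cpus, 0, sizeof(cpus));
	for (round = 1; round <= rounds; round++) {
		for (cpu = 0; cpu < NR_CPUS_SIM; cpu++) {
			if (cpu == stalled_cpu && round > stall_after)
				continue;	/* locked up: timer never fires */
			if (timer_tick(cpu)) {
				*victim = buddy_of(cpu);
				return cpu;
			}
		}
	}
	return -1;
}
```

In this model, simulate(2, 3, 30, &victim) returns 1 with victim set to 2:
CPU 1 is the one watching CPU 2, and it needs two consecutive checks with
an unchanged counter before it reports. The model also shows the first con
from the list: once CPU 2 is stuck, nobody is watching CPU 3 anymore, and
if every CPU stalls, nothing is reported at all.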

To leave breadcrumbs: I've posted v4, which is now a big series:

https://lore.kernel.org/r/20230504221349.1535669-1-dianders@xxxxxxxxxxxx

I took some people off the CC list that get_maintainer.pl had added on
v3, mostly because the list was getting unbearable. I tried to copy all
relevant mailing lists, so hopefully anyone who needs v4 can find it
somewhere it's easy for them to reply. If you got dropped off the CC
list and want back on for future versions, please yell and I'll add
you. Unless I messed up, I've CCed everyone who replied to previous
versions.

-Doug