Re: [PATCH] hardlockup: detect hard lockups using secondary (buddy) cpus

From: Andi Kleen
Date: Sun May 07 2023 - 13:12:49 EST



On Mon, Apr 24, 2023 at 08:23:59AM -0700, Doug Anderson wrote:
> HPET system seems to have a single CPU in charge of processing the
> main NMI and then that single CPU is in charge of checking all the
> others. If that single CPU goes out to lunch then the system couldn't
> detect hard lockups.
>
> In any case, I'm happy to let others debate about the HPET system. For
> now, I'll take my action items to be:

We don't really seem to be making any progress on the HPET series, so
even if it is better in some ways, a series that is never merged is
always worse than one that is.

My experience is that cases where everything locks up are very rare.
I suspect that as long as we cover the garden-variety single-CPU lockup
case well, handling more complex cases likely brings sharply diminishing
returns. So whatever gets the job done is fine.

Yes, freeing the Perfmon resources is a big advantage of either approach.

-Andi