[RFC] How to test panic handlers, without crashing the kernel

From: Jocelyn Falempe
Date: Fri Mar 01 2024 - 06:15:40 EST


Hi,

While writing a panic handler for DRM devices [1], I needed a way to test it without crashing the machine.
So, from debugfs, I called atomic_notifier_call_chain(&panic_notifier_list, ...), but that has the side effect of also calling every other registered panic notifier.
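
To illustrate that side effect, here is a minimal userspace model of a notifier chain. It is only a sketch: all the model_* names and the two callbacks are hypothetical, and this is not the kernel's notifier implementation. It only reproduces the "walk the list and call everything" behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a notifier block; not kernel code. */
struct model_notifier {
	int (*call)(struct model_notifier *nb, unsigned long event, void *data);
	struct model_notifier *next;
};

/* Head of a singly linked chain of registered notifiers. */
struct model_chain {
	struct model_notifier *head;
};

int drm_hits;   /* number of times the handler under test ran */
int other_hits; /* number of times the unrelated handler ran */

/* Stands in for the panic handler we actually want to test. */
int drm_panic_cb(struct model_notifier *nb, unsigned long event, void *data)
{
	(void)nb; (void)event; (void)data;
	drm_hits++;
	return 0;
}

/* Stands in for any other notifier registered on the same chain. */
int other_panic_cb(struct model_notifier *nb, unsigned long event, void *data)
{
	(void)nb; (void)event; (void)data;
	other_hits++;
	return 0;
}

/* Append nb to the end of the chain. */
void model_chain_register(struct model_chain *chain, struct model_notifier *nb)
{
	struct model_notifier **p = &chain->head;

	assert(nb != NULL && nb->call != NULL);
	while (*p)
		p = &(*p)->next;
	nb->next = NULL;
	*p = nb;
}

/* Call every registered notifier in order; return how many were called. */
int model_chain_call(struct model_chain *chain, unsigned long event, void *data)
{
	struct model_notifier *nb;
	int called = 0;

	for (nb = chain->head; nb; nb = nb->next) {
		nb->call(nb, event, data);
		called++;
	}
	return called;
}

/* Register both handlers, trigger the chain once, return the call count. */
int model_demo(void)
{
	struct model_notifier drm = { .call = drm_panic_cb };
	struct model_notifier other = { .call = other_panic_cb };
	struct model_chain chain = { .head = NULL };

	model_chain_register(&chain, &drm);
	model_chain_register(&chain, &other);
	return model_chain_call(&chain, 0, NULL);
}
```

Calling model_demo() runs both callbacks, although only one of them is the handler under test. Triggering the whole chain cannot single out one notifier.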

Sima suggested moving that to the generic panic code and testing all panic notifiers through a dedicated debugfs interface.

I can move that code to kernel/, but before doing so, I would like to know whether you think that's the right way to test the panic code.


The second question is how to simulate a panic context in a non-destructive way, so that we can test the panic notifiers in CI without crashing the machine. The worst case for a panic notifier is when the panic occurs in NMI context, but I don't know how to simulate that. The goal would be to detect early whether a panic notifier tries to sleep or does other things that are not allowed in a panic context.


Best regards,

--

Jocelyn

[1] https://patchwork.freedesktop.org/patch/580183/?series=122244&rev=8