Re: [PATCH] lockdep: Panic on warning if panic_on_warn is set

From: Boqun Feng
Date: Sat Aug 20 2022 - 01:19:12 EST


On Fri, Aug 19, 2022 at 12:59:56PM +0200, Vincent Whitchurch wrote:
> On Thu, Aug 18, 2022 at 11:49:17PM +0200, Boqun Feng wrote:
> > On Thu, Aug 18, 2022 at 01:42:58PM +0200, Vincent Whitchurch wrote:
> > > There does not seem to be any way to get the system to panic if a
> > > lockdep warning is emitted, since those warnings don't use the normal
> > > WARN() infrastructure. Panicking on any lockdep warning can be
> > > desirable when the kernel is being run in a controlled environment
> > > solely for the purpose of testing. Make lockdep respect panic_on_warn
> > > to allow this, similar to KASAN and others.
> > >
> >
> > I'm not completely against this, but could you explain why you want to
> > panic on lockdep warning? I assume you want to have a kdump so that you
> > can understand the lock bugs closely? But lockdep detects the possibility
> > of a lock issue, so it's not an after-the-fact detector. In other words,
> > when lockdep warns, the deadlock has not actually happened yet. And
> > also lockdep tries very hard to print useful information to locate the
> > issues.
>
> I'm not trying to obtain a kdump in this case. I test device drivers
> under UML[0] and I want to make the tests stop and fail immediately if
> the driver triggers any kind of problem which results in splats in the
> log. I achieve this using panic_on_warn, panic_on_taint, and oops=panic
> which result in a panic and an error exit code from UML.
>
> [0] https://lore.kernel.org/lkml/20220311162445.346685-1-vincent.whitchurch@xxxxxxxx/
>
> For lockdep, without this patch, I would be forced to parse the logs
> after each test to determine if the test triggered a lockdep splat or not.
>

In that case, would a standard line with every lockdep warning help? For
example:

[...] A LOCKDEP issue detected.

There are two reasons I don't think making lockdep warnings panic is a
good idea:

* We don't know what other CIs expect. Given that lockdep doesn't panic
with panic_on_warn today, this patch changes the behavior for them, and
it may break their setups/scripts.

* As I said, lockdep warnings are different from other warnings, and
panicking doesn't provide more information for debugging.

So I think an extra line that helps scripts parse the log may be better.
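
To illustrate, a test harness could check the log for such a line roughly
as follows. This is only a sketch: the marker text is the example line
above and not an existing kernel message, and has_lockdep_splat() is a
hypothetical helper name.

```python
import re

# Hypothetical marker: the example line proposed above, not an existing
# kernel message.
LOCKDEP_MARKER = "A LOCKDEP issue detected."

# Optional printk timestamp prefix such as "[   12.345678] ".
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def has_lockdep_splat(log_text, marker=LOCKDEP_MARKER):
    """Return True if any log line, minus its timestamp, equals the marker."""
    for line in log_text.splitlines():
        if _TIMESTAMP.sub("", line).strip() == marker:
            return True
    return False
```

The harness would then fail the test (for example, exit non-zero) whenever
has_lockdep_splat() returns True for the captured console output.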

Work for you?

Regards,
Boqun

> > This patch adds lockdep_panic() to a few places, which is a pain to
> > maintain. So why do you want to panic on lockdep warning?
>
> It's adding the call to a lot of places since there is no existing
> common function indicating the end of a lockdep warning. I can move the
> already duplicated dump_stack() calls into the new function too so that
> some code is removed. The "stack backtrace" could possibly be
> consolidated too, but one of the call sites uses printk instead of
> pr_warn so I wasn't sure if it was OK to change that to a warn too.