Re: Converting dev->mutex into dev->spinlock ?

From: Greg Kroah-Hartman
Date: Sat Feb 04 2023 - 09:34:34 EST


On Sat, Feb 04, 2023 at 11:21:27PM +0900, Tetsuo Handa wrote:
> On 2023/02/04 22:47, Greg Kroah-Hartman wrote:
> > On Sat, Feb 04, 2023 at 10:32:11PM +0900, Tetsuo Handa wrote:
> >> Hello.
> >>
> >> There is a long-standing deadlock problem in driver core code caused by
> >> "struct device"->mutex being marked as "do not apply lockdep checks".
> >
> > The marking of a lock does not cause a deadlock problem, so what do you
> > mean exactly by this? Where is the actual deadlock?
>
> A few examples:
>
> https://syzkaller.appspot.com/bug?extid=2d6ac90723742279e101
> https://syzkaller.appspot.com/bug?extid=2e39bc6569d281acbcfb
> https://syzkaller.appspot.com/bug?extid=9ef743bba3a17c756174

Random links to syzkaller reports that are huge and not descriptive do
not actually persuade me, as I don't have the inclination to dig through
them, sorry.

Specific examples, with code, please.

> >> We can make this deadlock visible by applying [1], and we can confirm that
> >> there is a deadlock problem that I think needs to be addressed in core code [2].
> >
> > Any reason why you didn't cc: us on these patches?
>
> We can't apply this "drivers/core: Remove lockdep_set_novalidate_class() usage" patch

What patch is that? I do not see that in my inbox anywhere. I don't
even see it in my lkml archive, so I do not know what you are talking
about.

> until we fix all lockdep warnings that happen during the boot stage;

What lockdep warnings?

> otherwise syzbot testing can't work which is more painful than
> applying this patch now.

Again, I'm totally confused. What is the real bug/problem/issue here?

Where is the deadlock?

> Therefore, I tested this patch only locally (so that it is not applied now).

What patch? I'm totally confused.

> And I got a lockdep warning on the perf_event code.

What warning?

> I got the next lockdep warning on the driver core code when I tried a fix
> for the perf_event code suggested by Peter Zijlstra.

Again, what warning?

> Since Peter confirmed that this is a problem that led to commit
> 1704f47b50b5 ("lockdep: Add novalidate class for dev->mutex
> conversion"), this time I'm reporting this problem to you (so that you
> can propose a fix for the driver core code).

Again, I have no idea what the real problem is!

Please show me in the driver core code, where the deadlock is that needs
to be resolved. Without that, I can't answer anything...

totally and thoroughly confused,

greg k-h