Re: WARNING: possible circular locking dependency detected

From: Thomas Gleixner
Date: Tue Aug 29 2017 - 16:10:54 EST


On Tue, 29 Aug 2017, Peter Zijlstra wrote:
> On Tue, Aug 29, 2017 at 07:40:44PM +0200, Thomas Gleixner wrote:
>
> > One solution I'm looking into right now is to reverse the lock order and
> > actually make the hotplug code do:
> >
> > watchdog_lock();
> > cpu_write_lock();
> >
> > ....
> > cpu_write_unlock();
> > watchdog_unlock();
> >
> > and get rid of cpu_read_(un)lock() in the sysctl interface completely. I
> > know it's ugly, but we have other locks we take in the hotplug path as
> > well.
>
> This is to serialize the sysctl against hotplug? I'm not immediately
> seeing why watchdog_lock needs to be the outer most lock, is that
> because of vfs locks or something?

Well, the watchdog sysctls serialization today is:

cpus_read_lock();
mutex_lock(&watchdog_mutex);
do_stuff()
access -> online_cpu_mask

do_stuff()
...
...
cpus_read_lock();

So we need

watchdog_mutex -> cpuhotplug_rwsem

lock order all over the place.

> > Though it's quite a rewrite of that mess, which is particularly non trivial
> > because that extra non perf implementation in arch/powerpc which has its
> > own NMI watchdog thingy wants its calls preserved. But AFAICT so far it
> > should just work. Famous last words....
> >
> > Thoughts?
>
> So I have a patch _somewhere_ that preserves the event<->cpu relation
> across hotplug and disable/enable would be sufficient. If you want I can
> try and dig that out and make it work again.
>
> That would avoid having to do the destroy/create cycle of the watchdog
> events.

Yes, that would solve the x86_release_hw() issue, but still lots of the
other rework is required in one way or the other.

I'm currently trying to avoid that extra lock mess in the cpu hotplug code,
which would just open the door for everybody to add his extra locks there,
so we end up taking a gazillion locks before we can hotplug :)

I think I have an idea how to solve that cleanly, but certainly your offer
of preserving the event - cpu relation accross hotplug would help
tremendously.

Thanks,

tglx