Re: [-next] Lockdep warnings

From: Mark Rutland
Date: Tue Jul 26 2022 - 08:41:03 EST


[Adding Peter; I suspect this is due to the cpuidle rework]

I'll go give next a spin in a VM, but I suspect I might need real HW to see
this due to the way PSCI idle states work.

Mark.

On Tue, Jul 26, 2022 at 11:41:34AM +0100, Sudeep Holla wrote:
> I was seeing the below lockdep warnings on my arm64 Juno development
> platform almost 2 weeks back with -next. I wanted to check for similar
> reports before post and forgot.
>
> --->8
>
> DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
> hardirqs last enabled at (46157): cpuidle_enter_state+0x174/0x2b4
> WARNING: CPU: 5 PID: 0 at kernel/locking/lockdep.c:5506 check_flags+0x90/0x1e8
> hardirqs last disabled at (46158): el1_interrupt+0x2c/0xc8
> Modules linked in:
> softirqs last enabled at (46154): __do_softirq+0x2c0/0x388
> softirqs last disabled at (46139): __irq_exit_rcu+0x118/0x18c
> CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.19.0-rc6-next-20220714 #9
> pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : check_flags+0x90/0x1e8
> lr : check_flags+0x90/0x1e8
> Call trace:
> check_flags+0x90/0x1e8
> lock_is_held_type+0x80/0x164
> rcu_read_lock_sched_held+0x40/0x7c
> trace_rcu_dyntick+0x5c/0x140
> ct_kernel_enter+0x78/0xd4
> ct_idle_exit+0x1c/0x44
> cpu_idle_poll+0x74/0xb8
> do_idle+0x90/0x2c4
> cpu_startup_entry+0x30/0x34
> secondary_start_kernel+0x130/0x144
> __secondary_switched+0xb0/0xb4
> irq event stamp: 64229
> hardirqs last enabled at (64229): cpu_idle_poll+0x40/0xb8
> hardirqs last disabled at (64228): do_idle+0xbc/0x2c4
> softirqs last enabled at (64190): __do_softirq+0x2c0/0x388
> softirqs last disabled at (64185): __irq_exit_rcu+0x118/0x18c
> ---[ end trace 0000000000000000 ]---
> possible reason: unannotated irqs-off.
> irq event stamp: 64229
> hardirqs last enabled at (64229): cpu_idle_poll+0x40/0xb8
> hardirqs last disabled at (64228): do_idle+0xbc/0x2c4
> softirqs last enabled at (64190): __do_softirq+0x2c0/0x388
> softirqs last disabled at (64185): __irq_exit_rcu+0x118/0x18c
>
> ----
>
> However I don't see the above warning with the latest -next. When I tried
> yesterday's -next now, I see a different warning. Not sure if they are
> related. I haven't tried to bisect.
>
> --->8
> =============================
> [ BUG: Invalid wait context ]
> 5.19.0-rc8-next-20220725 #38 Not tainted
> -----------------------------
> swapper/0/0 is trying to lock:
> (&drvdata->spinlock){....}-{3:3}, at: cti_cpu_pm_notify+0x54/0x114
> other info that might help us debug this:
> context-{5:5}
> 1 lock held by swapper/0/0:
> #0: (cpu_pm_notifier.lock){....}-{2:2}, at: cpu_pm_enter+0x2c/0x80
> stack backtrace:
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc8-next-20220725-00004-g599e6691ed8c #38
> Call trace:
> dump_backtrace+0xe8/0x108
> show_stack+0x18/0x4c
> dump_stack_lvl+0x90/0xc8
> dump_stack+0x18/0x54
> __lock_acquire+0xa70/0x32d0
> lock_acquire+0x160/0x308
> _raw_spin_lock+0x60/0xa0
> cti_cpu_pm_notify+0x54/0x114
> raw_notifier_call_chain_robust+0x50/0xd4
> cpu_pm_enter+0x48/0x80
> psci_enter_idle_state+0x34/0x74
> cpuidle_enter_state+0x120/0x2a8
> cpuidle_enter+0x38/0x50
> do_idle+0x1e8/0x2b8
> cpu_startup_entry+0x24/0x28
> kernel_init+0x0/0x1a0
> start_kernel+0x0/0x470
> start_kernel+0x34c/0x470
> __primary_switched+0xbc/0xc4
>
> ----
>
> --
> Regards,
> Sudeep