Re: [tip: locking/core] lockdep: Fix lockdep recursion

From: Peter Zijlstra
Date: Fri Oct 09 2020 - 12:24:28 EST


On Fri, Oct 09, 2020 at 06:58:37AM -0700, Paul E. McKenney wrote:
> On Fri, Oct 09, 2020 at 09:41:24AM -0400, Qian Cai wrote:
> > On Fri, 2020-10-09 at 07:58 +0000, tip-bot2 for Peter Zijlstra wrote:
> > > The following commit has been merged into the locking/core branch of tip:
> > >
> > > Commit-ID: 4d004099a668c41522242aa146a38cc4eb59cb1e
> > > Gitweb:
> > > https://git.kernel.org/tip/4d004099a668c41522242aa146a38cc4eb59cb1e
> > > Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > > AuthorDate: Fri, 02 Oct 2020 11:04:21 +02:00
> > > Committer: Ingo Molnar <mingo@xxxxxxxxxx>
> > > CommitterDate: Fri, 09 Oct 2020 08:53:30 +02:00
> > >
> > > lockdep: Fix lockdep recursion
> > >
> > > Steve reported that lockdep_assert*irq*(), when nested inside lockdep
> > > itself, will trigger a false-positive.
> > >
> > > One example is the stack-trace code, as called from inside lockdep,
> > > triggering tracing, which in turn calls RCU, which then uses
> > > lockdep_assert_irqs_disabled().
> > >
> > > Fixes: a21ee6055c30 ("lockdep: Change hardirq{s_enabled,_context} to per-cpu
> > > variables")
> > > Reported-by: Steven Rostedt <rostedt@xxxxxxxxxxx>
> > > Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> > > Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
> >
> > Reverting this linux-next commit fixed booting RCU-list warnings everywhere.
>
> Is it possible that the RCU-list warnings were being wrongly suppressed
> without a21ee6055c30? As in are you certain that these RCU-list warnings
> are in fact false positives?

> > [ 4.002695][ T0] init_timer_key+0x29/0x220
> > [ 4.002695][ T0] identify_cpu+0xfcb/0x1980
> > [ 4.002695][ T0] identify_secondary_cpu+0x1d/0x190
> > [ 4.002695][ T0] smp_store_cpu_info+0x167/0x1f0
> > [ 4.002695][ T0] start_secondary+0x5b/0x290
> > [ 4.002695][ T0] secondary_startup_64_no_verify+0xb8/0xbb


They're actually correct warnings, this is trying to use RCU before that
CPU is reported to RCU.

Possibly something like the below works, but I've not tested it, nor
have I really thought hard about it, bring up tricky and this is just
moving code.

---

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 35ad8480c464..9173d64ee69d 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1670,6 +1670,9 @@ void __init identify_boot_cpu(void)
void identify_secondary_cpu(struct cpuinfo_x86 *c)
{
BUG_ON(c == &boot_cpu_data);
+
+ rcu_cpu_starting(smp_processor_id());
+
identify_cpu(c);
#ifdef CONFIG_X86_32
enable_sep_cpu();
diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.c b/arch/x86/kernel/cpu/mtrr/mtrr.c
index 6a80f36b5d59..5f436cb4f7c4 100644
--- a/arch/x86/kernel/cpu/mtrr/mtrr.c
+++ b/arch/x86/kernel/cpu/mtrr/mtrr.c
@@ -794,8 +794,6 @@ void mtrr_ap_init(void)
if (!use_intel() || mtrr_aps_delayed_init)
return;

- rcu_cpu_starting(smp_processor_id());
-
/*
* Ideally we should hold mtrr_mutex here to avoid mtrr entries
* changed, but this routine will be called in cpu boot time,