[RFC PATCH] x86, espfix: postpone the initialization of espfix stack for AP

From: Gu Zheng
Date: Fri May 22 2015 - 06:32:52 EST


The following lockdep warning occurs when running with 4.1.0-rc3:
[ 3.178000] ------------[ cut here ]------------
[ 3.183000] WARNING: CPU: 128 PID: 0 at kernel/locking/lockdep.c:2755 lockdep_trace_alloc+0xdd/0xe0()
[ 3.193000] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
[ 3.199000] Modules linked in:

[ 3.203000] CPU: 128 PID: 0 Comm: swapper/128 Not tainted 4.1.0-rc3 #70
[ 3.221000] 0000000000000000 2d6601fb3e6d4e4c ffff88086fd5fc38 ffffffff81773f0a
[ 3.230000] 0000000000000000 ffff88086fd5fc90 ffff88086fd5fc78 ffffffff8108c85a
[ 3.238000] ffff88086fd60000 0000000000000092 ffff88086fd60000 00000000000000d0
[ 3.246000] Call Trace:
[ 3.249000] [<ffffffff81773f0a>] dump_stack+0x4c/0x65
[ 3.255000] [<ffffffff8108c85a>] warn_slowpath_common+0x8a/0xc0
[ 3.261000] [<ffffffff8108c8e5>] warn_slowpath_fmt+0x55/0x70
[ 3.268000] [<ffffffff810ee24d>] lockdep_trace_alloc+0xdd/0xe0
[ 3.274000] [<ffffffff811cda0d>] __alloc_pages_nodemask+0xad/0xca0
[ 3.281000] [<ffffffff810ec7ad>] ? __lock_acquire+0xf6d/0x1560
[ 3.288000] [<ffffffff81219c8a>] alloc_page_interleave+0x3a/0x90
[ 3.295000] [<ffffffff8121b32d>] alloc_pages_current+0x17d/0x1a0
[ 3.301000] [<ffffffff811c869e>] ? __get_free_pages+0xe/0x50
[ 3.308000] [<ffffffff811c869e>] __get_free_pages+0xe/0x50
[ 3.314000] [<ffffffff8102640b>] init_espfix_ap+0x17b/0x320
[ 3.320000] [<ffffffff8105c691>] start_secondary+0xf1/0x1f0
[ 3.327000] ---[ end trace 1b3327d9d6a1d62c ]---

This appears to be a false warning from lockdep: init_espfix_ap()
allocates pages with GFP_KERNEL and is called before local irqs are
enabled, so the lockdep subsystem treats it as a __GFP_FS allocation
with local irqs disabled and triggers the warning shown above.
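
For illustration, the condition lockdep complains about can be modelled
in userspace roughly as below. This is only a sketch and not the kernel
code: the MODEL_* flag values and the function names are made up for
this example, and the real check in lockdep_trace_alloc() has further
early exits that are left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits for this model only; not the kernel's values. */
#define MODEL_GFP_WAIT   0x1u
#define MODEL_GFP_IO     0x2u
#define MODEL_GFP_FS     0x4u
#define MODEL_GFP_KERNEL (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_ATOMIC 0x0u

/*
 * Hypothetical helper: returns true when the modelled check would warn,
 * i.e. the allocation may wait and may recurse into filesystem reclaim
 * while local irqs are disabled.
 */
bool model_alloc_would_warn(unsigned int gfp_mask, bool irqs_disabled)
{
	/* An allocation that cannot wait never enters reclaim. */
	if (!(gfp_mask & MODEL_GFP_WAIT))
		return false;

	/* Only allocations that allow FS reclaim are of interest. */
	if (!(gfp_mask & MODEL_GFP_FS))
		return false;

	return irqs_disabled;
}

/* The cases discussed in this mail, written as assertions. */
void model_examples(void)
{
	/* GFP_KERNEL-like mask before local irqs are enabled: warns. */
	assert(model_alloc_would_warn(MODEL_GFP_KERNEL, true));

	/* The same allocation once local irqs are enabled: no warning. */
	assert(!model_alloc_would_warn(MODEL_GFP_KERNEL, false));

	/* A mask that cannot wait is not flagged by this check. */
	assert(!model_alloc_would_warn(MODEL_GFP_ATOMIC, true));
}
```

In this model the allocation in init_espfix_ap() corresponds to the
first case: a GFP_KERNEL-like mask used while irqs are still disabled.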

We could allocate the pages on the boot CPU and hand them over to the
secondary CPU, but that would be wasteful if some CPUs stay offline.
Since these pages (the espfix stack) are not needed until we try to run
user code, we can postpone the initialization of the espfix stack until
after the CPU has booted, which avoids the noise.

Signed-off-by: Gu Zheng <guz.fnst@xxxxxxxxxxxxxx>
---
arch/x86/kernel/smpboot.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 50e547e..3ce05de 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -240,13 +240,6 @@ static void notrace start_secondary(void *unused)
check_tsc_sync_target();

/*
- * Enable the espfix hack for this CPU
- */
-#ifdef CONFIG_X86_ESPFIX64
- init_espfix_ap();
-#endif
-
- /*
* We need to hold vector_lock so there the set of online cpus
* does not change while we are assigning vectors to cpus. Holding
* this lock ensures we don't half assign or remove an irq from a cpu.
@@ -901,6 +894,13 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle)
}
}

+ /*
+ * Enable the espfix hack for this CPU
+ */
+#ifdef CONFIG_X86_ESPFIX64
+ init_espfix_ap();
+#endif
+
/* mark "stuck" area as not stuck */
*trampoline_status = 0;

--
1.8.3.1

