[PATCH AUTOSEL 6.3 01/14] cpu/hotplug: Reset task stack state in _cpu_up()

From: Sasha Levin
Date: Sun Jul 02 2023 - 15:43:38 EST


From: David Woodhouse <dwmw@xxxxxxxxxxxx>

[ Upstream commit 6d712b9b3a58018259fb40ddd498d1f7dfa1f4ec ]

Commit dce1ca0525bf ("sched/scs: Reset task stack state in bringup_cpu()")
ensured that the shadow call stack and KASAN poisoning were removed from
a CPU's stack each time that CPU is brought up, not just once.

This is not incorrect. However, with parallel bringup the idle thread setup
will happen at a different step. As a consequence the cleanup in
bringup_cpu() would be too late.

Move the SCS/KASAN cleanup to the generic _cpu_up() function instead,
which already ensures that the new CPU's stack is available, purely to
allow for early failure. This occurs when the CPU to be brought up is
in the CPUHP_OFFLINE state, which should correctly do the cleanup any
time the CPU has been taken down to the point where such is needed.

Signed-off-by: David Woodhouse <dwmw@xxxxxxxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Reviewed-by: Mark Rutland <mark.rutland@xxxxxxx>
Tested-by: Mark Rutland <mark.rutland@xxxxxxx>
Tested-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>
Tested-by: Oleksandr Natalenko <oleksandr@xxxxxxxxxxxxxx>
Tested-by: Helge Deller <deller@xxxxxx> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxx> # Steam Deck
Link: https://lore.kernel.org/r/20230512205257.027075560@xxxxxxxxxxxxx
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
---
kernel/cpu.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 6c0a92ca6bb59..43e0a77f21e81 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -591,12 +591,6 @@ static int bringup_cpu(unsigned int cpu)
struct task_struct *idle = idle_thread_get(cpu);
int ret;

- /*
- * Reset stale stack state from the last time this CPU was online.
- */
- scs_task_reset(idle);
- kasan_unpoison_task_stack(idle);
-
/*
* Some architectures have to walk the irq descriptors to
* setup the vector space for the cpu which comes online.
@@ -1383,6 +1377,12 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen, enum cpuhp_state target)
ret = PTR_ERR(idle);
goto out;
}
+
+ /*
+ * Reset stale stack state from the last time this CPU was online.
+ */
+ scs_task_reset(idle);
+ kasan_unpoison_task_stack(idle);
}

cpuhp_tasks_frozen = tasks_frozen;
--
2.39.2