Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail

From: Christophe Leroy
Date: Tue Apr 25 2023 - 09:37:08 EST




On 25/04/2023 at 13:53, Peter Zijlstra wrote:
> On Tue, Apr 25, 2023 at 06:59:29AM -0400, Joel Fernandes wrote:
>>> I'm a little confused; the way I understand the whole stack protector
>>> thing to work is that we push a canary on the stack at call and on
>>> return check it is still valid. Since in general tasks randomly migrate,
>>> the per-cpu validation canary should be the same on all CPUs.
>
>> AFAICS, the canary is randomly chosen both in the kernel [1]. This
>
> Yes, at boot, once. But thereafter it should be the same for all CPUs.

Each task has its own canary, stored in its task struct:

kernel/fork.c:1012: tsk->stack_canary = get_random_canary();

On PPC32, register 'r2' points to the current task struct at all times,
so GCC is instructed to find the canary at an offset from r2.

But on PPC64 we have no such register. Instead, r13 points to the PACA
struct, which is a per-cpu structure, and the PACA struct holds a pointer
to the 'current' task struct. So, in order to have the canary at an
offset from a fixed register as GCC expects, we copy the task's canary
into the cpu's PACA struct during _switch():

addi r6,r4,-THREAD /* Convert THREAD to 'current' */
std r6,PACACURRENT(r13) /* Set new 'current' */
#if defined(CONFIG_STACKPROTECTOR)
ld r6, TASK_CANARY(r6)
std r6, PACA_CANARY(r13)
#endif

The problem is that r13 changes when a task is switched to another
CPU. If GCC is using a copy of an older value of r13, it will take the
canary from the other CPU's PACA struct, and so it gets the canary of
whatever task is now running on that CPU instead of the canary of the
current task running on the current CPU.
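
To make the failure mode concrete, here is a minimal user-space sketch in
C. It is only a model, not kernel code: struct task, struct paca,
switch_to_task(), load_canary() and stale_r13_mismatch() are hypothetical
names invented for illustration, and the canary values are arbitrary
constants. It assumes the compiler kept a copy of r13 across the point
where the task migrated.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct task {
	uint64_t stack_canary;		/* per-task canary, chosen at fork */
};

struct paca {
	struct task *current;		/* models PACACURRENT */
	uint64_t canary;		/* models PACA_CANARY */
};

/* Models the _switch() fragment: publish the incoming task on this CPU. */
static void switch_to_task(struct paca *cpu, struct task *next)
{
	cpu->current = next;
	cpu->canary = next->stack_canary;
}

/* Models the load GCC emits: fixed offset from whatever r13 value it holds. */
static uint64_t load_canary(const struct paca *r13)
{
	return r13->canary;
}

/* Returns 1 if a check made through a stale r13 copy would mismatch. */
static int stale_r13_mismatch(void)
{
	struct task a = { .stack_canary = 0x1111111111111111ULL };
	struct task b = { .stack_canary = 0x2222222222222222ULL };
	struct paca cpu0 = { 0 }, cpu1 = { 0 };

	/* Task A runs on CPU 0; the function prologue saves A's canary. */
	switch_to_task(&cpu0, &a);
	const struct paca *cached_r13 = &cpu0;	/* assumed stale copy of r13 */
	uint64_t saved = load_canary(cached_r13);
	assert(saved == a.stack_canary);

	/* A migrates to CPU 1, and task B is scheduled on CPU 0. */
	switch_to_task(&cpu1, &a);
	switch_to_task(&cpu0, &b);

	/* With the live r13 (CPU 1) the epilogue check would still pass. */
	assert(load_canary(&cpu1) == saved);

	/* Through the stale copy, the epilogue reads B's canary instead. */
	return load_canary(cached_r13) != saved;
}
```

Calling stale_r13_mismatch() returns 1 in this model: the value saved in
the prologue is A's canary, while the value read back through the old r13
is B's, which is the kind of mismatch that would end in
__stack_chk_fail().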

Christophe