Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail

From: Zhouyi Zhou
Date: Tue Apr 25 2023 - 22:38:01 EST


On Wed, Apr 26, 2023 at 10:15 AM Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
>
> Hi Zhouyi,
>
> On Wed, Apr 26, 2023 at 09:31:17AM +0800, Zhouyi Zhou wrote:
> [..]
> > Joel makes the learning process easier for me, indeed!
>
> I know that feeling being a learner myself ;-)
>
> > One question I have tried very hard to understand, but still confused.
> > for now, I know
> > r13 is fixed, but r1 is not, why "r9,40(r1)"'s 40(r1) can be assumed
> > to be equal to 3192(r10).
>
> First you have to I guess read up a bit about stack canaries. Google for
> "gcc stack protector" and "gcc stack canaries", and the look for basics of
> "buffer overflow attacks". That'll explain the concept of stack guards etc
> (Sorry if this is too obvious but I did not know how much you knew about it
> already).
>
> 40(r1) is where the canary was stored. In the beginning of the function, you
> have this:
>
> c000000000226d58: 78 0c 2d e9 ld r9,3192(r13)
> c000000000226d5c: 28 00 21 f9 std r9,40(r1)
>
> r1 is your stack pointer. 3192(r13) is the canary value.
>
> 40(r1) is where the canary is stored for later comparison.
>
> r1 should not change through out the function I believe, because otherwise
> you don't know where the stack frame is, right?
Thanks Joel's awesome explanation. I can understand the mechanics
behind our situation now!!
40(r1) is where the canary is stored for later comparison, this is
located on the stack.
while 3192(r13) is inside the cpu's PACA.
I quote Christophe's note here
"in order to be able to
have the canary as an offset of a fixed register as expected by GCC, we
copy the task canary into the cpu's PACA struct during _switch():
addi r6,r4,-THREAD /* Convert THREAD to 'current' */
std r6,PACACURRENT(r13) /* Set new 'current' */
#if defined(CONFIG_STACKPROTECTOR)
ld r6, TASK_CANARY(r6)
std r6, PACA_CANARY(r13)
#endif
"
>
> Later you have this stuff before the function returns which gcc presumably
> did due to optimization. That mr means move register and is where the caching
> of r13 to r10 happens that Boqun pointed.
Thank Boqun and all others' wonderful debugging! Your work confirmed
my bug report ;-)
>
> c000000000226eb4: 78 6b aa 7d mr r10,r13
> [...]
> and then the canary comparison happens:
>
> c000000000226ed8: 28 00 21 e9 ld r9,40(r1)
> c000000000226edc: 78 0c 4a e9 ld r10,3192(r10)
3192(r13) is correct because "we copy the task canary into the cpu's
PACA struct during _switch():"
but 3192(r10) is not correct, because r10 is the old value of r13.
> c000000000226ee0: 79 52 29 7d xor. r9,r9,r10
> c000000000226ee4: 00 00 40 39 li r10,0
> c000000000226ee8: b8 03 82 40 bne c0000000002272a0 <srcu_gp_start_if_needed+0x5a0>
>
> So looks like for this to blow up, the preemption/migration has to happen
> precisely between the mr doing the caching, and the xor doing the comparison,
> since that's when the r10 will be stale.
Thank Joel and all others for your time ;-)
I benefit a lot, and am very glad to do more good work to the
community in return ;-)

Cheers
Zhouyi
>
> thanks,
>
> - Joel
>