Re: [PATCH] ARM:unwind:fix unwind abort for uleb128 case

From: Linus Walleij
Date: Tue Apr 11 2023 - 17:32:13 EST


On Fri, Apr 7, 2023 at 5:33 AM Haibo Li <haibo.li@xxxxxxxxxxxx> wrote:

> When unwind instruction is 0xb2,the subsequent instructions
> are uleb128 bytes.
> For now,it uses only the first uleb128 byte in code.
>
> For vsp increments of 0x204~0x400,use one uleb128 byte like below:
> 0xc06a00e4 <unwind_test_work>: 0x80b27fac
> Compact model index: 0
> 0xb2 0x7f vsp = vsp + 1024
> 0xac pop {r4, r5, r6, r7, r8, r14}
>
> For vsp increments larger than 0x400,use two uleb128 bytes like below:
> 0xc06a00e4 <unwind_test_work>: @0xc0cc9e0c
> Compact model index: 1
> 0xb2 0x81 0x01 vsp = vsp + 1032
> 0xac pop {r4, r5, r6, r7, r8, r14}
> The unwind works well since the decoded uleb128 byte is also 0x81.
>
> For vsp increments larger than 0x600,use two uleb128 bytes like below:
> 0xc06a00e4 <unwind_test_work>: @0xc0cc9e0c
> Compact model index: 1
> 0xb2 0x81 0x02 vsp = vsp + 1544
> 0xac pop {r4, r5, r6, r7, r8, r14}
> In this case,the decoded uleb128 result is 0x101(vsp=0x204+(0x101<<2)).
> While the uleb128 used in code is 0x81(vsp=0x204+(0x81<<2)).
> The unwind aborts at this frame since it gets incorrect vsp.
>
> To fix this,add uleb128 decode to cover all the above case.
>
> Signed-off-by: Haibo Li <haibo.li@xxxxxxxxxxxx>

[Added people such as Catalin, Ard and Anurag who wrote the lion's
share of actual algorithms in this file]

I would just link the wikipedia in the patch commit log actually:

Link: https://en.wikipedia.org/wiki/LEB128

for poor souls like me who need a primer on this encoding.

It's great if you also have a reference to the spec where you
found this, but I take your word for that this appears in code.
Did compilers always emit this? Then we should have a Cc stable
to this patch. Unfortunately the link in the top of the file is dead.

> +static unsigned long unwind_decode_uleb128(struct unwind_ctrl_block *ctrl)

So this decodes max an unsigned long? Are we sure that will always
suffice?

> +{
> + unsigned long result = 0;
> + unsigned long insn;
> + unsigned long bytes = 0;
> +
> + do {
> + insn = unwind_get_byte(ctrl);
> + result |= (insn & 0x7f) << (bytes * 7);
> + bytes++;
> + if (bytes == sizeof(result))
> + break;
> + } while (!!(insn & 0x80));

I suppose the documentation is in the commit message, but something terse
and nice that make us understand this code would be needed here as well.
Could you fold in a comment of how the do {} while-loop works and th expected
outcome? Something like:

"unwind_get_byte() will advance ctrl one instruction at a time, we loop
until we get an instruction byte where bit 7 is not set."

Is there a risk that this will loop forever or way too long if it happens
to point at some corrupted memory containing say 0xff 0xff 0xff ...?

Since we're decoding a 32 bit unsigned long maybe break the loop after max
5 bytes (35 bits)? Or are we sure this will not happen?

> @@ -361,7 +376,7 @@ static int unwind_exec_insn(struct unwind_ctrl_block *ctrl)
> if (ret)
> goto error;
> } else if (insn == 0xb2) {
> - unsigned long uleb128 = unwind_get_byte(ctrl);
> + unsigned long uleb128 = unwind_decode_uleb128(ctrl);

Is unsigned long always enough? We are sure?

Yours,
Linus Walleij