Re: Linux 5.19-rc8

From: Linus Torvalds
Date: Mon Jul 25 2022 - 13:55:41 EST


On Mon, Jul 25, 2022 at 9:11 AM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>
> BUG: KFENCE: out-of-bounds read in _find_next_bit_le+0x10/0x48

Ok, I was hoping somebody more ARMy would look at this, particularly
since there is no call trace beyond the actual fault.

So it shows that it happens in _find_next_bit_le(), but not who called it.

It does show "who allocated the page", and I can see the message that
is printed afterwards, so it comes from that

static void __init test_bitmap_printlist(void)

function, so I guess we know the call chain:

test_bitmap_printlist ->
bitmap_print_to_pagebuf ->
scnprintf "%*pbl\n" ->
pointer ->
bitmap_list_string ->
for_each_set_bitrange

and I think I see what's wrong in there. That thing does

(b) = find_next_bit((addr), (size), (e) + 1), \
(e) = find_next_zero_bit((addr), (size), (b) + 1))

for the end of the range, and looking at the oops, the instruction
that oopses is

ldrb r3, [r0, r2, lsr #3]

where 'r2' is the bit position, and 'r0' is the start of the bitmap.

And:

> r10: 00000000 r9 : 0000002d r8 : ef59d000
> r7 : c0e55514 r6 : c2215000 r5 : 00008000 r4 : 00008000
> r3 : 845cac12 r2 : 00008001 r1 : 00008000 r0 : ef59d000

Lookie here: r1 contains the size, and r2 is past the end of the size.

So pick your poison: either the bug is in

(a) the bitmap region iterators shouldn't even ask for past-the-end results

I've added Dennis Zhou who did that first
bitmap_for_each_set_region() in commit e837dfde15a4 ("bitmap:
genericize percpu bitmap region iterators"), and Yuri Norov who
renamed and moved it to for_each_set_bitrange() in commit ec288a2cf7ca
("bitmap: unify find_bit operations").

or

(b) the ARM find_next_bit() implementation, which doesn't check
whether the position is past the end

I've added Russell King (ARM stuff) and Catalin Marinas who
touched that last many many years ago in 8b592783a2e8 ("Thumb-2:
Implement the unified arch/arm/lib functions")

I think it's arguably a little bit of both, but mostly (b).

Note how the genetic find_next_bit() (and _find_next_bit()) does

if (unlikely(start >= nbits))
return nbits;

but the arm version of it does not.

I think the fix might be something like this:

diff --git a/arch/arm/lib/findbit.S b/arch/arm/lib/findbit.S
index b5e8b9ae4c7d..b36ca301892e 100644
--- a/arch/arm/lib/findbit.S
+++ b/arch/arm/lib/findbit.S
@@ -83,6 +83,8 @@ ENDPROC(_find_first_bit_le)
ENTRY(_find_next_bit_le)
teq r1, #0
beq 3b
+ cmp r2, r1
+ bhs 3b
ands ip, r2, #7
beq 1b @ If new byte, goto old routine
ARM( ldrb r3, [r0, r2, lsr #3] )

but my ARM asm is so broken that the above is just really random noise
that may or may not build - much less work.

I'll leave it to Russell &co to have a tested and working patch.

Hmm?

Linus