Re: KASAN: use-after-free Read in ___bpf_prog_run

From: Yonghong Song
Date: Thu Jan 12 2023 - 02:00:10 EST




On 1/9/23 5:21 AM, Hao Sun wrote:


Yonghong Song <yhs@xxxxxxxx> 于2022年12月18日周日 00:57写道:



On 12/16/22 10:54 PM, Hao Sun wrote:


On 17 Dec 2022, at 1:07 PM, Yonghong Song <yhs@xxxxxxxx> wrote:



On 12/14/22 11:49 PM, Hao Sun wrote:
Hi,
The following KASAN report can be triggered by loading and test
running this simple BPF prog with a random data/ctx:
0: r0 = bpf_get_current_task_btf ;
R0_w=trusted_ptr_task_struct(off=0,imm=0)
1: r0 = *(u32 *)(r0 +8192) ;
R0_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
2: exit
I've simplified the C reproducer but didn't find the root cause.
JIT was disabled, and the interpreter triggered UAF when executing
the load insn. A slab-out-of-bound read can also be triggered:
https://pastebin.com/raw/g9zXr8jU
This can be reproduced on:
HEAD commit: b148c8b9b926 selftests/bpf: Add few corner cases to test
padding handling of btf_dump
git tree: bpf-next
console log: https://pastebin.com/raw/1EUi9tJe
kernel config: https://pastebin.com/raw/rgY3AJDZ
C reproducer: https://pastebin.com/raw/cfVGuCBm

I I tried with your above kernel config and C reproducer and cannot reproduce the kasan issue you reported.

[root@arch-fb-vm1 bpf-next]# ./a.out
func#0 @0
0: R1=ctx(off=0,imm=0) R10=fp0
0: (85) call bpf_get_current_task_btf#158 ; R0_w=trusted_ptr_task_struct(off=0,imm=0)
1: (61) r0 = *(u32 *)(r0 +8192) ; R0_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
2: (95) exit
processed 3 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

prog fd: 3
[root@arch-fb-vm1 bpf-next]#

Your config indeed has kasan on.

Hi,

I can still reproduce this on a latest bpf-next build: 0e43662e61f25
(“tools/resolve_btfids: Use pkg-config to locate libelf”).
The simplified C reproducer sometime need to be run twice to trigger
the UAF. Also note that interpreter is required. Here is the original
C reproducer that loads and runs the BPF prog continuously for your
convenience:
https://pastebin.com/raw/WSJuNnVU


I still cannot reproduce with more than 10 runs. The config has jit off
so it already uses interpreter. It has kasan on as well.
# CONFIG_BPF_JIT is not set

Since you can reproduce it, I guess it would be great if you can
continue to debug this.


The load insn ‘r0 = *(u32*) (current + 8192)’ is OOB, because sizeof(task_struct)
is 7240 as shown in KASAN report. The issue is that struct task_struct is special,
its runtime size is actually smaller than it static type size. In X86:

task_struct->thread_struct->fpu->fpstate->union fpregs_state is
/*
* ...
* The size of the structure is determined by the largest
* member - which is the xsave area. The padding is there
* to ensure that statically-allocated task_structs (just
* the init_task today) have enough space.
*/
union fpregs_state {
struct fregs_state fsave;
struct fxregs_state fxsave;
struct swregs_state soft;
struct xregs_state xsave;
u8 __padding[PAGE_SIZE];
};

In btf_struct_access(), the resolved size for task_struct is 10496, much bigger
than its runtime size, so the prog in reproducer passed the verifier and leads
to the oob. This can happen to all similar types, whose runtime size is smaller
than its static size.

Not sure how many similar cases are there, maybe special check to task_struct
is enough. Any hint on how this should be addressed?

This should a corner case, I am not aware of other allocations like this.

For a normal program, if the access chain looks
like

task_struct->thread_struct->fpu->fpstate->fpregs_state->{fsave,fxsave, soft, xsave},
we should not hit this issue. So I think we don't need to address this
issue in kernel. The test itself should filter this out.