Re: [PATCH v2 2/2] x86/cfi,bpf: Fix BPF JIT call

From: Peter Zijlstra
Date: Fri Dec 08 2023 - 17:47:07 EST


On Fri, Dec 08, 2023 at 12:58:01PM -0800, Alexei Starovoitov wrote:
> On Fri, Dec 8, 2023 at 12:52 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Fri, Dec 08, 2023 at 12:41:03PM -0800, Alexei Starovoitov wrote:
> > > On Fri, Dec 8, 2023 at 12:35 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > > > -__bpf_kfunc void bpf_task_release(struct task_struct *p)
> > > > +__bpf_kfunc void bpf_task_release(void *p)
> > >
> > > Yeah. That won't work. We need a wrapper.
> > > Since bpf prog is also calling it directly.
> > > In progs/task_kfunc_common.h
> > > void bpf_task_release(struct task_struct *p) __ksym;
> > >
> > > > then later both libbpf and the verifier check that
> > > what bpf prog is calling actually matches the proto
> > > of what is in the kernel.
> > > Effectively we're doing strong prototype check at load time.
> >
> > I'm still somewhat confused on how this works, where does BPF get the
> > address of the function from? and what should I call the wrapper?
>
> It starts with
> register_btf_id_dtor_kfuncs() that takes a set of btf_ids:
> {btf_id_of_type, btf_id_of_dtor_function}, ...
>
> Then based on btf_id_of_dtor_function we find its type proto, name, do checks,
> and eventually:
> addr = kallsyms_lookup_name(dtor_func_name);
> field->kptr.dtor = (void *)addr;
>
> bpf_task_release(struct task_struct *p) would need to stay as-is,
> but we can have a wrapper
> void bpf_task_release_dtor(void *p)
> {
> 	bpf_task_release(p);
> }
>
> And adjust the above lookup with extra "_dtor" suffix.
>
> > > btw instead of EXPORT_SYMBOL_GPL(bpf_task_release)
> > > can __ADDRESSABLE be used ?
> > > Since it's not an export symbol.
> >
> > No, __ADDRESSABLE() is expressly ignored, but we have IBT_NOSEAL() that
> > should do it. I'll rename the thing and lift it out of x86 to avoid
> > breaking all other arch builds.
>
> Makes sense.
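
For reference, the wrapper-plus-"_dtor"-suffix lookup described above can be
sketched in plain userspace C. This is only an illustration under stated
assumptions, not the code in the tree: struct task_struct is a stand-in, and
dtor_fn_t, symtab and lookup_dtor() are hypothetical names standing in for
the kallsyms-based lookup.

```c
/* Userspace sketch only; none of this is the actual kernel code. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct task_struct { int usage; };	/* stand-in for the kernel type */

typedef void (*dtor_fn_t)(void *);	/* hypothetical: type used at the indirect call site */

/* Typed kfunc: keeps the prototype that BPF programs declare via __ksym. */
void bpf_task_release(struct task_struct *p)
{
	p->usage--;
}

/* Wrapper whose type matches dtor_fn_t, so the indirect call is type-correct. */
void bpf_task_release_dtor(void *p)
{
	bpf_task_release(p);
}

/* Hypothetical stand-in for kallsyms: maps symbol names to addresses. */
struct sym { const char *name; dtor_fn_t addr; };
static const struct sym symtab[] = {
	{ "bpf_task_release_dtor", bpf_task_release_dtor },
};

/* Look up "<kfunc_name>_dtor"; returns NULL when no such wrapper exists. */
dtor_fn_t lookup_dtor(const char *kfunc_name)
{
	char buf[128];
	int n;
	size_t i;

	assert(kfunc_name != NULL);
	n = snprintf(buf, sizeof(buf), "%s_dtor", kfunc_name);
	if (n < 0 || (size_t)n >= sizeof(buf))
		return NULL;	/* name too long for the buffer */
	for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++)
		if (strcmp(symtab[i].name, buf) == 0)
			return symtab[i].addr;
	return NULL;
}
```

The point of the split is that the typed function keeps the prototype the
verifier checks against, while the indirect call only ever goes through a
pointer whose type matches its target.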

Ok, did that. Current patches (on top of bpf-next) are here:

git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git x86/cfi

(really should try and write better changelogs, but it's too late)

The test_progs thing still doesn't run to completion; the next problem
seems to be bpf_throw():

[ 247.720159] ? die+0xa4/0xd0
[ 247.720216] ? do_trap+0xa5/0x180
[ 247.720281] ? __cfi_bpf_prog_8ac473954ac6d431_F+0xd/0x10
[ 247.720368] ? __cfi_bpf_prog_8ac473954ac6d431_F+0xd/0x10
[ 247.720459] ? do_error_trap+0xba/0x120
[ 247.720525] ? __cfi_bpf_prog_8ac473954ac6d431_F+0xd/0x10
[ 247.720614] ? handle_invalid_op+0x2c/0x40
[ 247.720684] ? __cfi_bpf_prog_8ac473954ac6d431_F+0xd/0x10
[ 247.720775] ? exc_invalid_op+0x38/0x60
[ 247.720840] ? asm_exc_invalid_op+0x1a/0x20
[ 247.720909] ? 0xffffffffc001ba54
[ 247.720971] ? __cfi_bpf_prog_8ac473954ac6d431_F+0xd/0x10
[ 247.721063] ? bpf_throw+0x9b/0xf0
[ 247.721126] ? bpf_test_run+0x108/0x350
[ 247.721191] ? bpf_prog_5555714b685bf0cf_exception_throw_always_1+0x26/0x26
[ 247.721301] ? bpf_test_run+0x108/0x350
[ 247.721368] bpf_test_run+0x212/0x350
[ 247.721433] ? slab_build_skb+0x22/0x110
[ 247.721503] bpf_prog_test_run_skb+0x347/0x4a0

But I'm too tired to think straight. Is this a bpf_callback_t vs
bpf_exception_cb difference?

I'll prod more later. Zzzz..