Re: [PoC][PATCH] bpf: Call return value check function in the JITed code

From: Roberto Sassu
Date: Wed Nov 16 2022 - 11:46:29 EST


On Wed, 2022-11-16 at 08:16 -0800, Alexei Starovoitov wrote:
> On Wed, Nov 16, 2022 at 7:48 AM Roberto Sassu
> <roberto.sassu@xxxxxxxxxxxxxxx> wrote:
> > +static bool is_ret_value_allowed(int ret, u32 ret_flags)
> > +{
> > + if ((ret < 0 && !(ret_flags & LSM_RET_NEG)) ||
> > + (ret == 0 && !(ret_flags & LSM_RET_ZERO)) ||
> > + (ret == 1 && !(ret_flags & LSM_RET_ONE)) ||
> > + (ret > 1 && !(ret_flags & LSM_RET_GT_ONE)))
> > + return false;
> > +
> > + return true;
> > +}
> > +
> > /* For every LSM hook that allows attachment of BPF programs, declare a nop
> > * function where a BPF program can be attached.
> > */
> > @@ -30,6 +41,15 @@ noinline RET bpf_lsm_##NAME(__VA_ARGS__) \
> > #include <linux/lsm_hook_defs.h>
> > #undef LSM_HOOK
> >
> > +#define LSM_HOOK(RET, DEFAULT, RET_FLAGS, NAME, ...) \
> > +noinline RET bpf_lsm_##NAME##_ret(int ret) \
> > +{ \
> > + return is_ret_value_allowed(ret, RET_FLAGS) ? ret : DEFAULT; \
> > +}
> > +
> > +#include <linux/lsm_hook_defs.h>
> > +#undef LSM_HOOK
> > +
>
> because lsm hooks is mess of undocumented return values your
> "solution" is to add hundreds of noninline functions
> and hack the call into them in JITs ?!

I revisited the documentation and checked each LSM hook one by one.
I hope I got it right, but I will review it again (others are welcome
to check it too).

I'm not sure if there is a more efficient way. Do you have any ideas?
Maybe we can find a way to use only one check function (by reusing the
address of the attachment point?).
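
To make the single-check-function idea concrete, here is a minimal
userspace sketch, not kernel code. The flag values, struct
lsm_ret_rule, and lsm_check_ret() are assumptions made up for this
illustration; only the LSM_RET_* names and the logic of
is_ret_value_allowed() come from the patch. The per-hook flags and
default live in a table keyed by the attachment address, so there is
one function instead of one per hook.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed bit values: the quoted patch names these flags but does not
 * show their definitions. */
#define LSM_RET_NEG    (1U << 0)
#define LSM_RET_ZERO   (1U << 1)
#define LSM_RET_ONE    (1U << 2)
#define LSM_RET_GT_ONE (1U << 3)

/* Hypothetical per-hook record replacing one generated function. */
struct lsm_ret_rule {
	const void *attach_addr;   /* address of the attachment point */
	unsigned int ret_flags;    /* which return classes are legal */
	int default_ret;           /* value used when ret is illegal */
};

/* Same logic as is_ret_value_allowed() in the quoted patch. */
bool is_ret_value_allowed(int ret, unsigned int ret_flags)
{
	if ((ret < 0 && !(ret_flags & LSM_RET_NEG)) ||
	    (ret == 0 && !(ret_flags & LSM_RET_ZERO)) ||
	    (ret == 1 && !(ret_flags & LSM_RET_ONE)) ||
	    (ret > 1 && !(ret_flags & LSM_RET_GT_ONE)))
		return false;

	return true;
}

/* Hypothetical single check function: find the rule for the attachment
 * point, then keep ret if it is legal or fall back to the default. */
int lsm_check_ret(const struct lsm_ret_rule *rules, size_t n,
		  const void *attach_addr, int ret)
{
	for (size_t i = 0; i < n; i++) {
		if (rules[i].attach_addr != attach_addr)
			continue;
		return is_ret_value_allowed(ret, rules[i].ret_flags) ?
		       ret : rules[i].default_ret;
	}

	/* Unknown attachment point: passed through unchanged here,
	 * which is only a choice made for this sketch. */
	return ret;
}
```

A linear scan is used only to keep the sketch short. A real
implementation would need a cheaper lookup, and the JIT would still
have to pass the attachment address to the function.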

Regarding the JIT approach, I didn't find a reliable solution that uses
only the verifier. As I wrote to you, there are cases where the tracked
range includes positive values, even though the only possible return
values are zero and -EACCES.

# ./test_progs-no_alu32 -t libbpf_get_fd

*reg = {type = SCALAR_VALUE, off = 0, {range = 0, {map_ptr = 0x0
<fixed_percpu_data>, map_uid = 0}, {btf = 0x0 <fixed_percpu_data>,
btf_id = 0}, mem_size = 0, dynptr = {type = BPF_DYNPTR_TYPE_INVALID,
first_slot = false}, raw = {raw1 = 0, raw2 = 0}, subprogno = 0}, id =
0,
ref_obj_id = 0, var_off = {value = 0, mask = 18446744073709551603},
smin_value = -9223372036854775808, smax_value = 9223372036854775795,
umin_value = 0, umax_value = 18446744073709551603, s32_min_value =
-2147483648, s32_max_value = 2147483635, u32_min_value = 0,
u32_max_value = 4294967283, parent = 0x0 <fixed_percpu_data>, frameno
= 0, subreg_def = 0, live = REG_LIVE_WRITTEN, precise = false}
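
To show why the range in the dump above admits positive values, here
is a minimal userspace sketch of the var_off membership test. The
names tnum_sketch, tnum_sketch_contains() and dumped_var_off() are
hypothetical. The sketch assumes the usual reading of var_off: bits
set in mask are unknown, and all other bits must equal value. The mask
18446744073709551603 is (u64)-13, i.e. -EACCES, so 0 and -EACCES both
fit, but so does any value built only from the unknown bits, such as
1, 2 and 3.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the verifier's tracked number: bits set in
 * mask are unknown, the remaining bits are known to equal value. */
struct tnum_sketch {
	uint64_t value;
	uint64_t mask;
};

/* x is possible iff all of its known bits match value. */
bool tnum_sketch_contains(struct tnum_sketch t, int64_t x)
{
	return ((uint64_t)x & ~t.mask) == t.value;
}

/* var_off copied from the register dump above. */
struct tnum_sketch dumped_var_off(void)
{
	struct tnum_sketch t = {
		.value = 0,
		.mask = 18446744073709551603ULL, /* == (u64)-13 */
	};

	return t;
}
```

With this var_off, only bits 2 and 3 are known to be zero, so
tnum_sketch_contains() accepts 0, -13, 1, 2 and 3 and rejects 4 and
13. The smax_value in the dump, 9223372036854775795, is 2^63 - 13,
which follows the same pattern.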

The JIT approach, by contrast, is 100% reliable, because it checks the
actual value that is returned to BPF LSM.

But of course, the performance will be worse this way. If you are able
to determine at verification time that an eBPF program is not going to
return illegal values, that would be better. I'm not sure it is
feasible.

Thanks

Roberto