Re: [PATCH bpf-next 5/6] bpf: Improve tracing recursion prevention mechanism

From: Yafang Shao
Date: Thu Apr 27 2023 - 10:23:45 EST


On Thu, Apr 27, 2023 at 9:26 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Mon, 17 Apr 2023 15:47:36 +0000
> Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
>
> > diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> > index f61d513..3df39a5 100644
> > --- a/kernel/bpf/trampoline.c
> > +++ b/kernel/bpf/trampoline.c
> > @@ -842,15 +842,21 @@ static __always_inline u64 notrace bpf_prog_start_time(void)
> > static u64 notrace __bpf_prog_enter_recur(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx)
> > __acquires(RCU)
>
> Because __bpf_prog_enter_recur() and __bpf_prog_exit_recur() can
> legitimately nest (as you pointed out later in the thread), I think my
> original plan is the way to go.
>
>
>
> > {
> > - rcu_read_lock();
> > - migrate_disable();
> > -
> > - run_ctx->saved_run_ctx = bpf_set_run_ctx(&run_ctx->run_ctx);
> > + int bit;
> >
> > - if (unlikely(this_cpu_inc_return(*(prog->active)) != 1)) {
> > + rcu_read_lock();
> > + bit = test_recursion_try_acquire(_THIS_IP_, _RET_IP_);
> > + run_ctx->recursion_bit = bit;
> > + if (bit < 0) {
> > + preempt_disable_notrace();
> > bpf_prog_inc_misses_counter(prog);
> > + preempt_enable_notrace();
> > return 0;
> > }
> > +
> > + migrate_disable();
>
> Just encompass the migrate_disable/enable() with the recursion protection.
>
> That is, here add:
>
> test_recursion_release(recursion_bit);
>
> No need to save it in the run_ctx, as you can use a local variable.
>
> As I mentioned, if it passes when checking migrate_disable() it will also
> pass when checking around migrate_enable() so the two will still be paired
> properly, even if only the migrate_enable() starts recursing.
>
>
> bit = test_recursion_try_acquire() // OK
> if (bit < 0)
> return;
> migrate_disable();
> test_recursion_release(bit);
>
> [..]
>
> bit = test_recursion_try_acquire() // OK
> migrate_enable() // traced and recurses...
>
> bit = test_recursion_try_acquire() // fails
> if (bit < 0)
> return; // returns here
> migrate_disable() // does not get called.
>
> The recursion around migrate_disable/enable() is needed because it's done
> before other checks. You can't attach the test_recursion logic to the
> __bpf_prog_enter/exit() routines, because those can legitimately recurse.
>

IIUC, the acquire/release pair works as follows:

test_recursion_try_acquire
    [ protection area ]
test_recursion_release

After the release there is no protection any more, so this approach
will fail the tools/testing/selftests/bpf/progs/recursion.c[1] test
case, because the recursion occurs in bpf_prog_run() itself:

__bpf_prog_enter
    test_recursion_try_acquire
    [...]
    test_recursion_release
    // no protection after the release
    bpf_prog_run()
        bpf_prog_run() // the recursion can't be prevented.
            __bpf_prog_enter
                test_recursion_try_acquire
                [...]
                test_recursion_release
                bpf_prog_run()
                    bpf_prog_run()
                        __bpf_prog_enter
                            test_recursion_try_acquire
                            [...]
                            test_recursion_release
                            bpf_prog_run()
                                [ And so on ... ]

[1]. https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/testing/selftests/bpf/progs/recursion.c#n38
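
To make the difference concrete, here is a minimal userspace sketch of
the two orderings. It is not kernel code: guard_try_acquire(),
guard_release(), prog_body(), the two enter_*() variants, count_runs()
and MAX_DEPTH are all hypothetical names made up for illustration, and
a single boolean stands in for the per-context recursion bit. The real
test_recursion_try_acquire() takes ip/parent_ip and tracks more state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the recursion bit; NOT the kernel API. */
static bool guard_held;

/* Returns 0 on success, -1 if the guard is already held (recursion). */
static int guard_try_acquire(void)
{
	if (guard_held)
		return -1;
	guard_held = true;
	return 0;
}

static void guard_release(int bit)
{
	assert(bit == 0 && guard_held);
	guard_held = false;
}

static int runs;	/* how many times the program body executed */
static int misses;	/* how many entries were rejected by the guard */

#define MAX_DEPTH 8	/* artificial cap so the unprotected case terminates */

/* Program body that hits its own attach point again, like recursion.c. */
static void prog_body(void (*enter)(int), int depth)
{
	runs++;
	if (depth + 1 < MAX_DEPTH)
		enter(depth + 1);
}

/* Guard covers only the setup step, then is dropped before the run. */
static void enter_release_early(int depth)
{
	int bit = guard_try_acquire();

	if (bit < 0) {
		misses++;
		return;
	}
	/* migrate_disable() would sit here in the proposal */
	guard_release(bit);
	prog_body(enter_release_early, depth);	/* runs unprotected */
}

/* Guard is held across the program run and dropped afterwards. */
static void enter_hold_across_run(int depth)
{
	int bit = guard_try_acquire();

	if (bit < 0) {
		misses++;
		return;
	}
	prog_body(enter_hold_across_run, depth);
	guard_release(bit);
}

/* Reset the counters, enter once at depth 0, report executed bodies. */
static int count_runs(void (*enter)(int))
{
	runs = 0;
	misses = 0;
	guard_held = false;
	enter(0);
	return runs;
}
```

Under these assumptions, count_runs(enter_release_early) recurses until
it hits the artificial MAX_DEPTH cap and never records a miss, while
count_runs(enter_hold_across_run) runs the body once and counts the
nested attempt as a miss. The latter is the behaviour the selftest
expects from the existing prog->active counter.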

--
Regards
Yafang