Re: [PATCH] tracing: Add generic test_recursion_try_acquire()

From: Yafang Shao
Date: Sat Apr 15 2023 - 09:34:43 EST


On Sat, Apr 15, 2023 at 10:17 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> From: "Steven Rostedt (Google)" <rostedt@xxxxxxxxxxx>
>
> The ftrace_test_recursion_trylock() also disables preemption. This is not
> required, but was a clean up as every place that called it also disabled
> preemption, and making the two tightly coupled appeared to make the code
> simpler.
>
> But the recursion protection can be used for other purposes that do not
> require disabling preemption. As the recursion bits are attached to the
> task_struct, it follows the task, so there's no need for preemption being
> disabled.
>
> Add test_recursion_try_acquire/release() functions to be used generically,
> and separate it from being associated with ftrace. It also removes the
> "lock" name, as there is no lock happening. Keeping the "lock" for the
> ftrace version is better, as it at least differentiates that preemption is
> being disabled (hence, "locking the CPU").
>
> Link: https://lore.kernel.org/linux-trace-kernel/20230321020103.13494-1-laoar.shao@xxxxxxxxx/
>
> Cc: Yafang Shao <laoar.shao@xxxxxxxxx>
> Signed-off-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>

Acked-by: Yafang Shao <laoar.shao@xxxxxxxxx>

+Alexei, bpf

Thanks Steven.
I almost finished replacing prog->active with
test_recursion_try_{acquire,release}[1], and yet I need to do more
tests.

In my verification, I find that something abnormal happens. When I ran
bpf self tests after the replacement, I found the fentry recursion
test failed[2]. The test result as follows,

main:PASS:skel_open_and_load 0 nsec
main:PASS:skel_attach 0 nsec
main:PASS:pass1 == 0 0 nsec
main:PASS:pass1 == 1 0 nsec
main:PASS:pass1 == 2 0 nsec
main:PASS:pass2 == 0 0 nsec
main:FAIL:pass2 == 1 unexpected pass2 == 1: actual 2 != expected 1 [0]
main:FAIL:pass2 == 2 unexpected pass2 == 2: actual 4 != expected 2
main:PASS:get_prog_info 0 nsec
main:PASS:recursion_misses 0 nsec

The reason is that the bpf_map_delete_elem() in this fentry SEC()[2]
will hit the first if-condition[3] but fail to hit the second
if-condition[4], so it won't be considered as recursion at the first
time. So the value 'pass2' will be 2[0]. Illustrated as follows,

SEC("fentry/htab_map_delete_elem")
pass2++; <<<< Turn out to be 1 after this operation.
bpf_map_delete_elem(&hash2, &key);
SEC("fentry/htab_map_delete_elem") <<<< not recursion
pass2++; <<<< Turn out to be 2 after this operation.
bpf_map_delete_elem(&hash2, &key);
SEC("fentry/htab_map_delete_elem") <<<< RECURSION!!

That said, if a function can call itself, the first call won't be
considered as recursion. Is that expected ?

One thing I can be sure of is that prog->active can't handle the
interrupted case[5]. If the interrupt happens, it will be considered
as a recursion.
But it seems that bpf_map_delete_elem() in this fentry SEC() is still
in the process context.

[1]. https://lore.kernel.org/bpf/CALOAHbACzCwu-VeMczEJw8A4WFgkL-uQDS1NkcVR2pqEMZyAQQ@xxxxxxxxxxxxxx/
[2]. https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/testing/selftests/bpf/progs/recursion.c#n38
[3]. https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/include/linux/trace_recursion.h#n166
[4]. https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/include/linux/trace_recursion.h#n176
[5]. https://lore.kernel.org/bpf/20230409220239.0fcf6738@xxxxxxxxxxxxxxxxxxxx/

> ---
> include/linux/trace_recursion.h | 47 ++++++++++++++++++++++++---------
> kernel/trace/ftrace.c | 2 ++
> 2 files changed, 37 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/trace_recursion.h b/include/linux/trace_recursion.h
> index d48cd92d2364..80de2ee7b4c3 100644
> --- a/include/linux/trace_recursion.h
> +++ b/include/linux/trace_recursion.h
> @@ -150,9 +150,6 @@ extern void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip);
> # define trace_warn_on_no_rcu(ip) false
> #endif
>
> -/*
> - * Preemption is promised to be disabled when return bit >= 0.
> - */
> static __always_inline int trace_test_and_set_recursion(unsigned long ip, unsigned long pip,
> int start)
> {
> @@ -182,18 +179,11 @@ static __always_inline int trace_test_and_set_recursion(unsigned long ip, unsign
> val |= 1 << bit;
> current->trace_recursion = val;
> barrier();
> -
> - preempt_disable_notrace();
> -
> return bit;
> }
>
> -/*
> - * Preemption will be enabled (if it was previously enabled).
> - */
> static __always_inline void trace_clear_recursion(int bit)
> {
> - preempt_enable_notrace();
> barrier();
> trace_recursion_clear(bit);
> }
> @@ -205,12 +195,18 @@ static __always_inline void trace_clear_recursion(int bit)
> * tracing recursed in the same context (normal vs interrupt),
> *
> * Returns: -1 if a recursion happened.
> - * >= 0 if no recursion.
> + * >= 0 if no recursion and preemption will be disabled.
> */
> static __always_inline int ftrace_test_recursion_trylock(unsigned long ip,
> unsigned long parent_ip)
> {
> - return trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START);
> + int bit;
> +
> + bit = trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START);
> + if (unlikely(bit < 0))
> + return bit;
> + preempt_disable_notrace();
> + return bit;
> }
>
> /**
> @@ -220,6 +216,33 @@ static __always_inline int ftrace_test_recursion_trylock(unsigned long ip,
> * This is used at the end of a ftrace callback.
> */
> static __always_inline void ftrace_test_recursion_unlock(int bit)
> +{
> + preempt_enable_notrace();
> + trace_clear_recursion(bit);
> +}
> +
> +/**
> + * test_recursion_try_acquire - tests for recursion in same context
> + *
> + * This will detect recursion of a function.
> + *
> + * Returns: -1 if a recursion happened.
> + * >= 0 if no recursion
> + */
> +static __always_inline int test_recursion_try_acquire(unsigned long ip,
> + unsigned long parent_ip)
> +{
> + return trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START);
> +}
> +
> +/**
> + * test_recursion_release - called after a success of test_recursion_try_acquire()
> + * @bit: The return of a successful test_recursion_try_acquire()
> + *
> + * This releases the recursion lock taken by a non-negative return call
> + * by test_recursion_try_acquire().
> + */
> +static __always_inline void test_recursion_release(int bit)
> {
> trace_clear_recursion(bit);
> }
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index db8532a4d5c8..1b117ab057ed 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -7299,6 +7299,7 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
> if (bit < 0)
> return;
>
> + preempt_disable();
> do_for_each_ftrace_op(op, ftrace_ops_list) {
> /* Stub functions don't need to be called nor tested */
> if (op->flags & FTRACE_OPS_FL_STUB)
> @@ -7320,6 +7321,7 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
> }
> } while_for_each_ftrace_op(op);
> out:
> + preempt_enable();
> trace_clear_recursion(bit);
> }
>
> --
> 2.39.2
>


--
Regards
Yafang