Re: [PATCH] bpf: reject blacklisted symbols in kprobe_multi to avoid recursive trap

From: Ze Gao
Date: Sat May 13 2023 - 05:19:51 EST


Exactly, and rethook_trampoline_handler() suffers from the same problem.

I've posted two patches for kprobe and rethook that use the notrace
versions of preempt_{disable,enable} to fix fprobe+rethook:
[1] https://lore.kernel.org/all/20230513081656.375846-1-zegao@xxxxxxxxxxx/T/#u
[2] https://lore.kernel.org/all/20230513090548.376522-1-zegao@xxxxxxxxxxx/T/#u
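
For reference, the kprobe side roughly amounts to switching
kprobe_busy_begin()/kprobe_busy_end() in kernel/kprobes.c over to the
notrace variants, along the lines of the sketch below (simplified; the
actual patches in [1] and [2] may differ in detail):

void kprobe_busy_begin(void)
{
        struct kprobe_ctlblk *kcb;

        /* was preempt_disable(), whose preempt_count_add() is traceable */
        preempt_disable_notrace();
        __this_cpu_write(current_kprobe, &kprobe_busy);
        kcb = get_kprobe_ctlblk();
        kcb->kprobe_status = KPROBE_HIT_ACTIVE;
}

void kprobe_busy_end(void)
{
        __this_cpu_write(current_kprobe, NULL);
        /* was preempt_enable() */
        preempt_enable_notrace();
}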

Even worse, the bpf callback path introduces more such cases, which are
typically organized as follows to guard the lifetime of bpf-related
resources (per-cpu accesses or the trampoline):

migrate_disable()
rcu_read_lock()
...
bpf_prog_run()
...
rcu_read_unlock()
migrate_enable()
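
For example, kprobe_multi_link_prog_run() in kernel/trace/bpf_trace.c
follows roughly this shape (simplified here, with the bpf_prog_active
recursion check elided; see the source for the exact code):

static int
kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
                           unsigned long entry_ip, struct pt_regs *regs)
{
        struct bpf_kprobe_multi_run_ctx run_ctx = {
                .link = link,
                .entry_ip = entry_ip,
        };
        struct bpf_run_ctx *old_run_ctx;
        int err;

        /* ... bpf_prog_active recursion check elided ... */

        migrate_disable();
        rcu_read_lock();
        old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
        err = bpf_prog_run(link->link.prog, regs);
        bpf_reset_run_ctx(old_run_ctx);
        rcu_read_unlock();
        migrate_enable();

        /* ... */
        return err;
}

So attaching a kprobe_multi program to migrate_disable() or
__rcu_read_lock() makes this path re-enter itself.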

But fully solving such bugs may require introducing an fprobe_blacklist
and a bpf_kprobe_blacklist, just as Jiri and Yonghong suggested, since
bpf kprobe works on a different (higher and more constrained) level than
fprobe and ftrace, and we cannot blindly mark functions (migrate_disable,
__rcu_read_lock, etc.) used in tracer callbacks from external subsystems
without risking semantic breakage. I will try to implement these ideas
later.
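
As a rough illustration only (the names bpf_kprobe_blacklist_syms and
within_bpf_kprobe_blacklist() below are hypothetical and do not exist
today), an attach-time check for kprobe_multi could look like:

/* Hypothetical sketch: reject blacklisted symbols at attach time. */
static const char * const bpf_kprobe_blacklist_syms[] = {
        "migrate_disable",
        "migrate_enable",
        "__rcu_read_lock",
        "__rcu_read_unlock",
};

static bool within_bpf_kprobe_blacklist(unsigned long addr)
{
        int i;

        for (i = 0; i < ARRAY_SIZE(bpf_kprobe_blacklist_syms); i++) {
                unsigned long sym;

                sym = kallsyms_lookup_name(bpf_kprobe_blacklist_syms[i]);
                if (sym && addr == sym)
                        return true;
        }
        return false;
}

/*
 * Then bpf_kprobe_multi_link_attach(), after resolving the requested
 * symbols to addresses, would do something like:
 *
 *      for (i = 0; i < cnt; i++)
 *              if (within_bpf_kprobe_blacklist(addrs[i]))
 *                      return -EINVAL;
 */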

Thanks,
Ze

On Sat, May 13, 2023 at 12:18 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Fri, 12 May 2023 07:29:02 -0700
> Yonghong Song <yhs@xxxxxxxx> wrote:
>
> > A fprobe_blacklist might make sense indeed as fprobe and kprobe are
> > quite different... Thanks for working on this.
>
> Hmm, I think I see the problem:
>
> fprobe_kprobe_handler() {
>    kprobe_busy_begin() {
>       preempt_disable() {
>          preempt_count_add() {  <-- trace
>             fprobe_kprobe_handler() {
>                [ wash, rinse, repeat, CRASH!!! ]
>
> Either the kprobe_busy_begin() needs to use preempt_disable_notrace()
> versions, or fprobe_kprobe_handler() needs a
> ftrace_test_recursion_trylock() call.
>
> -- Steve