Re: [PATCH bpf-next 5/6] bpf: Improve tracing recursion prevention mechanism

From: Yafang Shao
Date: Thu Apr 27 2023 - 08:36:26 EST


On Thu, Apr 27, 2023 at 8:15 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
>
> On Thu, Apr 27, 2023 at 5:57 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> >
> > On Tue, Apr 25, 2023 at 5:40 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> > >
> > > On Wed, 19 Apr 2023 15:46:34 -0700
> > > Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > > No. Just one prog at entry into any of the kernel functions
> > > > and another prog at entry of funcs that 1st bpf prog called indirectly.
> > > > Like one prog is tracing networking events while another
> > > > is focusing on mm. They should not conflict.
> > >
> > > You mean that you have:
> > >
> > > function start:
> > >   __bpf_prog_enter_recur()
> > >     bpf_program1()
> > >       __bpf_prog_enter_recur()
> > >         bpf_program2();
> > >       __bpf_prog_exit_recur()
> > >   __bpf_prog_exit_recur()
> > >
> > >   rest of function
> > >
> > > That is, a bpf program can be called within another bpf program between
> > > the prog_enter and prog_exit(), that is in the same context (normal,
> > > softirq, irq, etc)?
> > >
> >
> > Right, that can happen per my verification. Below is a simple bpf
> > program to verify it.
> >
> > struct {
> >         __uint(type, BPF_MAP_TYPE_LPM_TRIE);
> >         __type(key, __u64);
> >         __type(value, __u64);
> >         __uint(max_entries, 1024);
> >         __uint(map_flags, BPF_F_NO_PREALLOC);
> > } write_map SEC(".maps");
> >
> > __u64 key;
> >
> > SEC("fentry/kernel_clone")
> > int program1()
> > {
> >         __u64 value = 1;
> >
> >         bpf_printk("before update");
> >         // It will call trie_update_elem and thus trigger program2.
> >         bpf_map_update_elem(&write_map, &key, &value, BPF_ANY);
> >         __sync_fetch_and_add(&key, 1);
> >         bpf_printk("after update");
> >         return 0;
> > }
> >
> > SEC("fentry/trie_update_elem")
> > int program2()
> > {
> >         bpf_printk("trie_update_elem");
> >         return 0;
> > }
> >
> > The result is as follows:
> >
> > kubelet-203203 [018] ....1 9579.862862: __bpf_prog_enter_recur: __bpf_prog_enter_recur
> > kubelet-203203 [018] ...11 9579.862869: bpf_trace_printk: before update
> > kubelet-203203 [018] ....2 9579.862869: __bpf_prog_enter_recur: __bpf_prog_enter_recur
> > kubelet-203203 [018] ...12 9579.862870: bpf_trace_printk: trie_update_elem
> > kubelet-203203 [018] ....2 9579.862870: __bpf_prog_exit_recur: __bpf_prog_exit_recur
> > kubelet-203203 [018] ...11 9579.862870: bpf_trace_printk: after update
> > kubelet-203203 [018] ....1 9579.862871: __bpf_prog_exit_recur: __bpf_prog_exit_recur
> >
> > Note that we can't trace __bpf_prog_enter_recur and
> > __bpf_prog_exit_recur, so we have to modify the kernel to print them.
> >
>
> ... However, surprisingly, it still works even after this patchset is
> applied, because the hardirq/softirq flag is set while program2 is
> running. See also the flags in the above trace_pipe output. Is that
> expected?
> I need some time to figure it out, but maybe you have a quick answer...

To answer my own question: that is because of the rule that allows one
single recursion. I misread the trace flags before.
Sorry for the noise.
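
For reference, here is a minimal user-space sketch of how such a rule can
let program2 run inside program1. It assumes the rule behaves like the
"transition" bit in ftrace's recursion helpers: each context owns one bit,
and one extra nesting in the same context is tolerated through a shared
transition bit. All names below (toy_ctx, toy_recursion_mask,
toy_try_acquire, toy_release) are hypothetical and are not taken from the
patchset; this is a toy model, not the kernel implementation.

```c
/* Toy model only: hypothetical names, not the kernel's real API. */

/* Hypothetical context indices; a real kernel derives the context
 * from preempt_count() rather than taking it as an argument. */
enum toy_ctx {
	TOY_CTX_NMI,
	TOY_CTX_IRQ,
	TOY_CTX_SOFTIRQ,
	TOY_CTX_NORMAL,
	TOY_CTX_TRANSITION,	/* shared bit that permits one extra nesting */
};

/* One mask for the whole process here; assumed to be per-task in a kernel. */
static unsigned int toy_recursion_mask;

/*
 * Try to enter a program in context @ctx.
 * Returns the bit that was taken (>= 0), or -1 if entry is refused.
 */
static int toy_try_acquire(enum toy_ctx ctx)
{
	int bit = ctx;

	if (toy_recursion_mask & (1u << bit)) {
		/* Context bit already held: fall back to the transition bit. */
		bit = TOY_CTX_TRANSITION;
		if (toy_recursion_mask & (1u << bit))
			return -1;	/* second recursion: refuse */
	}
	toy_recursion_mask |= 1u << bit;
	return bit;
}

/* Leave a program; @bit is the value returned by toy_try_acquire(). */
static void toy_release(int bit)
{
	if (bit >= 0)
		toy_recursion_mask &= ~(1u << bit);
}
```

In this model, the first toy_try_acquire(TOY_CTX_NORMAL) takes the normal
context bit (program1), a nested call takes the transition bit (program2),
and a third nested call is refused with -1.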


--
Regards
Yafang