Re: [PATCH RFC bpf-next 0/9] allow HID-BPF to do device IOs

From: Alexei Starovoitov
Date: Mon Feb 12 2024 - 16:24:36 EST


On Mon, Feb 12, 2024 at 10:21 AM Benjamin Tissoires
<benjamin.tissoires@xxxxxxxxxx> wrote:
>
> On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
> >
> > Benjamin Tissoires <benjamin.tissoires@xxxxxxxxxx> writes:
> >
> > [...]
> > >> IIUC, the bpf_timer callback is just a function (subprog) from the
> > >> verifier PoV, so it is verified as whatever program type is creating the
> > >> timer. So in other words, as long as you setup the timer from inside a
> > >> tracing prog type, you should have access to all the same kfuncs, I
> > >> think?
> > >
> > > Yep, you are correct. But as mentioned above, I am now in trouble with
> > > the sleepable state:
> > > - I need to call timer_start() from a non sleepable tracing function
> > > (I'm in hard IRQ when dealing with a physical device)
> > > - but then, ideally, the callback function needs to be tagged as a
> > > sleepable one, so I can export my kfuncs which are doing kzalloc and
> > > device IO as such.
> > >
> > > However, I can not really teach the BPF verifier to do so:
> > > - it seems to check for the callback first when it is loaded, and
> > > there is no SEC() equivalent for static functions
> > > - libbpf doesn't have access to the callback as a prog as it has to be
> > > a static function, and thus isn't exported as a full-blown prog.
> > > - the verifier only checks for the callback when dealing with
> > > BPF_FUNC_timer_set_callback, which doesn't have a "flag" argument
> > > (though the validation of the callback has already been done while
> > > checking it first, so we are already too late to change the sleppable
> > > state of the callback)
> > >
> > > Right now, the only OK-ish version I have is declaring the kfunc as
> > > non-sleepable, but checking that we are in a different context than
> > > the IRQ of the initial event. This way, I am not crashing if this
> > > function is called from the initial IRQ, but will still crash if used
> > > outside of the hid context.
> > >
> > > This is not satisfactory, but I feel like it's going to be hard to
> > > teach the verifier that the callback function is sleepable in that
> > > case (maybe we could suffix the callback name, like we do for
> > > arguments, but this is not very clean either).
> >
> > The callback is only set once when the timer is first setup; I *think*
> > it works to do the setup (bpf_timer_init() and bpf_timer_set_callback())
> > in the context you need (from a sleepable prog), but do the arming
> > (bpf_timer_start()) from a different program that is not itself sleepable?
> >
>
> Genius! It works, and I can just keep having them declared as a
> syscall kfunc, not as a tracing kfunc.
>
> But isn't this an issue outside of my use case? I mean, if the
> callback is assuming the environment for when it is set up but can be
> called from any context there seems to be a problem when 2 contexts
> are not equivalent, no?

I agree that workqueue delegation fits into the bpf_timer concept and
a lot of code can and should be shared.
All the lessons(bugs) learned with bpf_timer don't need to be re-discovered :)
Too bad, bpf_timer_set_callback() doesn't have a flag argument,
so we need a new kfunc to set a sleepable callback.
Maybe
bpf_timer_set_sleepable_cb() ?
The verifier will set is_async_cb = true for it (like it does for regular cb-s).
And since prog->aux->sleepable is kinda "global" we need another
per subprog flag:
bool is_sleepable: 1;

We can factor out a check "if (prog->aux->sleepable)" into a helper
that will check that "global" flag and another env->cur_state->in_sleepable
flag that will work similar to active_rcu_lock.
Once the verifier starts processing subprog->is_sleepable
it will set cur_state->in_sleepable = true;
to make all subprogs called from that cb to be recognized as sleepable too.

A bit of a challenge is what to do with global subprogs,
since they're verified lazily. They can be called from
sleepable and non-sleepable contex. Should be solvable.

Overall I think this feature is needed urgently,
so if you don't have cycles to work on this soon,
I can prioritize it right after bpf_arena work.