Re: [PATCH bpf-next] bpf: Support default .validate() and .update() behavior for struct_ops links

From: Stanislav Fomichev
Date: Thu Aug 10 2023 - 19:15:15 EST


On 08/10, David Vernet wrote:
> On Thu, Aug 10, 2023 at 03:46:18PM -0700, Stanislav Fomichev wrote:
> > On 08/10, David Vernet wrote:
> > > Currently, if a struct_ops map is loaded with BPF_F_LINK, it must also
> > > define the .validate() and .update() callbacks in its corresponding
> > > struct bpf_struct_ops in the kernel. Enabling struct_ops link is useful
> > > in its own right to ensure that the map is unloaded if an application
> > > crashes. For example, with sched_ext, we want to automatically unload
> > > the host-wide scheduler if the application crashes. We would likely
> > > never support updating elements of a sched_ext struct_ops map, so we'd
> > > have to implement these callbacks showing that they _can't_ support
> > > element updates just to benefit from the basic lifetime management of
> > > struct_ops links.
> > >
> > > Let's enable struct_ops maps to work with BPF_F_LINK even if they
> > > haven't defined these callbacks, by assuming that a struct_ops map
> > > element cannot be updated by default.
> >
> > Any reason this is not part of the sched_ext series? As you mention,
> > we don't seem to have such users in the tree?
>
> Hi Stanislav,
>
> The sched_ext series [0] implements these callbacks. See
> bpf_scx_update() and bpf_scx_validate().
>
> [0]: https://lore.kernel.org/all/20230711011412.100319-13-tj@xxxxxxxxxx/
>
> We could add this into that series and remove those callbacks, but this
> patch is fixing a UX / API issue with struct_ops links that's not really
> relevant to sched_ext. I don't think there's any reason to couple
> updating struct_ops map elements with allowing the kernel to manage the
> lifetime of struct_ops maps -- just because we only have 1 (non-test)
> struct_ops implementation in-tree doesn't mean we shouldn't improve APIs
> where it makes sense.
>
> Thanks,
> David

Ack. I guess up to you and Martin. Just trying to understand whether I'm
missing something or the patch does indeed fix some use-case :-)
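
For anyone following along, here is a rough userspace sketch of the default
behavior described in the commit message, as I read it: a missing .validate()
is treated as "nothing to check", and a missing .update() means link updates
are refused. This is not the kernel code. The mock_* names, the simplified
callback signatures, and the choice of -EOPNOTSUPP as the error are all
assumptions made for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel's struct bpf_struct_ops.
 * Only the two optional callbacks discussed in this thread are modeled,
 * and their signatures are simplified assumptions.
 */
struct mock_struct_ops {
	int (*validate)(void *kdata);                /* optional */
	int (*update)(void *kdata, void *old_kdata); /* optional */
};

/* Creating a link: call validate() only if the implementation has one. */
static int mock_link_create(const struct mock_struct_ops *ops, void *kdata)
{
	if (ops->validate)
		return ops->validate(kdata);
	return 0; /* no validate(): nothing to check, allow the link */
}

/* Updating a link: with no update() callback, refuse by default. */
static int mock_link_update(const struct mock_struct_ops *ops,
			    void *new_kdata, void *old_kdata)
{
	if (!ops->update)
		return -EOPNOTSUPP; /* assumed error code for this sketch */
	return ops->update(new_kdata, old_kdata);
}

/* A struct_ops type that defines neither callback still gets a link. */
static void mock_demo(void)
{
	struct mock_struct_ops no_callbacks = { NULL, NULL };
	int dummy = 0;

	assert(mock_link_create(&no_callbacks, &dummy) == 0);
	assert(mock_link_update(&no_callbacks, &dummy, &dummy) == -EOPNOTSUPP);
}
```

With something like this, an implementation that never wants element updates
would simply leave both pointers NULL and still get link-based lifetime
management, instead of writing stub callbacks that only return an error.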