Re: [PATCH v1] fs: Add kfuncs to handle idmapped mounts

From: Christian Brauner
Date: Thu Jul 06 2023 - 03:24:50 EST


On Wed, Jul 05, 2023 at 06:10:32PM -0700, Alexei Starovoitov wrote:
> On Tue, Jul 4, 2023 at 6:01 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> >
> > > > +/**
> > > > + * bpf_is_idmapped_mnt - check whether a mount is idmapped
> > > > + * @mnt: the mount to check
> > > > + *
> > > > + * Return: true if mount is mapped, false if not.
> > > > + */
> > > > +__bpf_kfunc bool bpf_is_idmapped_mnt(struct vfsmount *mnt)
> > > > +{
> > > > + return is_idmapped_mnt(mnt);
> > > > +}
> ...
> >
> > I don't want any of these helpers as kfuncs as they are peeking deeply
> > into implementation details that we reserve to change. Specifically in
> > the light of:
> >
> > 3. kfunc lifecycle expectations part b):
> >
> > "Unlike with regular kernel symbols, this is expected behavior for BPF
> > symbols, and out-of-tree BPF programs that use kfuncs should be considered
> > relevant to discussions and decisions around modifying and removing those
> > kfuncs. The BPF community will take an active role in participating in
> > upstream discussions when necessary to ensure that the perspectives of such
> > users are taken into account."
> >
> > That's too much stability for my taste for these helpers. The helpers
> > here exposed have been modified multiple times and once we wean off
> > idmapped mounts from user namespaces completely they will change again.
> > So I'm fine if they're traceable but not as kfuncs with any - even
> > minimal - stability guarantees.
>
> Christian,
> That quote is taken out of context.
> In the first place the Documentation/bpf/kfuncs.rst says:
> "
> kfuncs provide a kernel <-> kernel API, and thus are not bound by any of the
> strict stability restrictions associated with kernel <-> user UAPIs. This means
> they can be thought of as similar to EXPORT_SYMBOL_GPL, and can therefore be
> modified or removed by a maintainer of the subsystem they're defined in when
> it's deemed necessary.
> "
>
> bpf_get_file_vfs_ids is vfs related, so you guys decide when and how
> to add/remove them. It's ok that you don't want this particular one
> for whatever reason, but that reason shouldn't be 'stability guarantees'.
> There are really none. The kernel kfuncs can change at any time.
> There are plenty of examples in git log where we added and then
> tweaked/removed kfuncs.
>
> The doc also says:
> "
> As described above, while sometimes a maintainer may find that a kfunc must be
> changed or removed immediately to accommodate some changes in their subsystem,
> "
> and git log of such cases proves the point.
>
> The quote about out-of-tree bpf progs is necessary today, since
> very few bpf progs are in-tree, so when maintainers of a subsystem
> want to remove kfunc the program authors need something in the doc
> to point to and explain why and how they use the kfunc otherwise
> maintainers will just say 'go away. you're out-of-tree'.
> The users need their voice to be heard. Even if the result is the same.
> In other words the part you quoted is needed to make kfuncs usable.
> Otherwise 'kfunc is 100% unstable and maintainers can rename it
> every release just to make life of bpf prog writers harder'
> becomes a real possibility in the minds of bpf users.
> The kfunc doc makes it 100% clear that there are no stability guarantees.
> So please don't say 'minimal stability'.
>
> In your other reply:
>
> > we can look at the in-kernel users of is_idmapped_mnt(),
> > convert them and then kill this thing off if we wanted to.
>
> you can absolutely do that even if is_idmapped_mnt() is exposed as a kfunc.
> You'll just delete it with zero notice if you like.
> Just like what you would do with a normal export_symbol.
> The doc is pretty clear about it and there are examples where we did
> such things.

I think I said it somewhere else: I'm not opposing your position on
kfruncs in a sense I understand that's kinda the model that you have to
push for. But you have to admit that this out-of-tree portion is very
hard to swallow if you've been hit by out of tree modules and their
complaints about removed EXPORT_SYMBOL*()s.

I'm still rather hesitant about this because I find it hard to figure
out how this will go down in practice. But, especially with the internal
idmapped mount api. This is a very hidden and abstracted away
implementation around an opaque type and I'm not yet ready to let
modules or bpf programs peek into it's implementation details. I hope
that's understandable.