Re: [PATCH] tracefs/eventfs: Use root and instance inodes as default ownership

From: Steven Rostedt
Date: Wed Jan 10 2024 - 08:06:57 EST


On Wed, 10 Jan 2024 12:45:36 +0100
Christian Brauner <brauner@xxxxxxxxxx> wrote:

> So say you do:
>
> mkdir /sys/kernel/tracing/instances/foo
>
> After this has returned we know everything we need to know about the new
> tracefs instance including the ownership and the mode of all inodes in
> /sys/kernel/tracing/instances/foo/events/* and below precisely because
> ownership is always inherited from the parent dentry and recorded in the
> metadata struct eventfs_inode.
>
> So say someone does:
>
> open("/sys/kernel/tracing/instances/foo/events/xfs");
>
> and say this is the first time that someone accesses that events/
> directory.
>
> When the open pathwalk is done, the vfs will determine via
>
> [1] may_lookup(inode_of(events))
>
> whether you are able to list entries such as "xfs" in that directory.
> The vfs checks inode_permission(MAY_EXEC) on "events" and if that holds
> it ends up calling i_op->eventfs_root_lookup(events).
>
> At this point tracefs/eventfs adds the inodes for all entries in that
> "events" directory including "xfs" based on the metadata it recorded
> during the mkdir. Since now someone is actually interested in them. And
> it initializes the inodes with ownership and everything and adds the
> dentries that belong into that directory.
>
> Nothing here depends on the permissions of the caller. The only
> permission that mattered was done in the VFS in [1]. If the caller has
> permissions to enter a directory they can lookup and list its contents.
> And its contents were determined/fixed etc when mkdir was called.
>
> So we just need to add the required objects into the caches (inode,
> dentry) whose addition we intentionally deferred until someone actually
> needed them.
>
> So, eventfs_root_lookup() now initializes the inodes with the ownership
> from the stored metadata or from the parent dentry and splices in inodes
> and dentries. No permission checking is needed for this because it is
> always a recheck of what the vfs did in [1].
>
> We now return to the vfs and path walk continues to the final component
> that you actually want to open which is that "xfs" directory in this
> example. We check the permissions on that inode via may_open("xfs") and
> we open that directory returning an fd to userspace ultimately.
>
> (I'm going by memory since I need to step out the door.)

So, let's say we do:

chgrp -R rostedt /sys/kernel/tracing/

But I don't want rostedt to have access to xfs

chgrp -R root /sys/kernel/tracing/events/xfs

Both actions will create the inodes and dentries of all files and
directories (because of "-R"). But once that is done, the ref counts go to
zero. They stay around until reclaim. But then I open Chrome ;-) and it
reclaims all the dentries and inodes, so we are back to where we were on
boot.

Now as rostedt I do:

ls /sys/kernel/tracing/events/xfs

The VFS layer doesn't know if I have permission to that or not, because all
the inodes and dentries have been freed. It has to call back to eventfs to
find out, at which point eventfs_root_lookup() and eventfs_iterate_shared()
will recreate the inodes with the proper permissions.
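
To make the sequence above concrete, here is a rough userspace toy model (not
kernel code, and not how eventfs is actually implemented). Every name in it
(toy_meta, toy_inode, toy_setattr_gid(), toy_reclaim(), toy_lookup(),
toy_may_search(), GID_ROSTEDT) is made up for illustration, and the permission
check is simplified to group/other search bits only. The point it tries to
show is that the chgrp result has to live in metadata that survives reclaim,
so that a later lookup can rebuild the inode with the right group:

```c
/* Userspace toy model only. All names below are hypothetical. */
#include <assert.h>
#include <stdlib.h>

#define GID_ROOT    0u
#define GID_ROSTEDT 1000u	/* assumed gid for group "rostedt" */

/* Cache object: may be freed at any time once nothing references it. */
struct toy_inode {
	unsigned int gid;
	unsigned int mode;
};

/* Persistent metadata: survives reclaim (loosely plays the eventfs_inode role). */
struct toy_meta {
	unsigned int gid;
	unsigned int mode;		/* e.g. 0750 */
	struct toy_inode *cached;	/* NULL after reclaim */
};

/* "setattr": record the new group in the metadata and in any live inode. */
static void toy_setattr_gid(struct toy_meta *m, unsigned int gid)
{
	m->gid = gid;
	if (m->cached)
		m->cached->gid = gid;
}

/* "reclaim": drop the cached inode; only the metadata remains. */
static void toy_reclaim(struct toy_meta *m)
{
	free(m->cached);
	m->cached = NULL;
}

/* "lookup": recreate the inode from the stored metadata if it is gone. */
static struct toy_inode *toy_lookup(struct toy_meta *m)
{
	if (!m->cached) {
		m->cached = malloc(sizeof(*m->cached));
		if (!m->cached)
			return NULL;
		m->cached->gid = m->gid;
		m->cached->mode = m->mode;
	}
	return m->cached;
}

/* Simplified search check: group bit if the gid matches, else other bit. */
static int toy_may_search(const struct toy_inode *inode, unsigned int caller_gid)
{
	if (caller_gid == inode->gid)
		return (inode->mode & 0010) != 0;
	return (inode->mode & 0001) != 0;
}
```

With two toy_meta entries standing in for "events" and "events/xfs" (both mode
0750), the sequence is: set both to GID_ROSTEDT, set xfs back to GID_ROOT,
reclaim both, then look them up again as a caller in GID_ROSTEDT. In this
model the search check passes on "events" and fails on "xfs" only because
toy_setattr_gid() wrote into the metadata before the inodes went away.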

Or are you saying that I don't need the ".permission" callback, because
eventfs does it when it creates the inodes? But for eventfs to know what
the permission changes are, it uses .getattr and .setattr.

-- Steve