Re: [PATCH] tracefs/eventfs: Use root and instance inodes as default ownership

From: Christian Brauner
Date: Fri Jan 05 2024 - 09:26:46 EST


On Wed, Jan 03, 2024 at 08:32:46PM -0500, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" <rostedt@xxxxxxxxxxx>
>
> Instead of walking the dentries on mount/remount to update the gid values of
> all the dentries if a gid option is specified on mount, just update the root
> inode. Add .getattr, .setattr, and .permissions on the tracefs inode
> operations to update the permissions of the files and directories.
>
> For all files and directories in the top level instance:
>
> /sys/kernel/tracing/*
>
> It will use the root inode as the default permissions. The inode that
> represents: /sys/kernel/tracing (or wherever it is mounted).
>
> When an instance is created:
>
> mkdir /sys/kernel/tracing/instance/foo
>
> The directory "foo" and all its files and directories underneath will use
> the default of what foo is when it was created. A remount of tracefs will
> not affect it.

That kinda sounds like eventfs should actually be a separate filesystem.
But I don't know enough about the relationship between the two concepts.

>
> If a user were to modify the permissions of any file or directory in
> tracefs, it will also no longer be modified by a change in ownership of a
> remount.

Those are very odd semantics and I would recommend avoiding them. It's
just plain weird imo.

>
> The events directory, if it is in the top level instance, will use the
> tracefs root inode as the default ownership for itself and all the files and
> directories below it.
>
> For the events directory in an instance ("foo"), it will keep the ownership
> of what it was when it was created, and that will be used as the default
> ownership for the files and directories beneath it.
>
> Link: https://lore.kernel.org/linux-trace-kernel/CAHk-=wjVdGkjDXBbvLn2wbZnqP4UsH46E3gqJ9m7UG6DpX2+WA@xxxxxxxxxxxxxx/
>
> Signed-off-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>
> ---

So tracefs supports remounting with different uid/gid mount options and
then actually wades through _all_ of the inodes and changes their
ownership internally? What's the use-case for this? Containers?

Aside from optimizing this and the special semantics for this eventfs
stuff that you really should think twice about doing, here's one idea for
an extension that might alleviate some of the pain:

If you need flexible dynamic ownership changes, e.g., to be able to
delegate tracefs (all of it, a directory, or a single file) to
unprivileged users, containers, or whatever, then you might want to
consider supporting idmapped mounts for tracefs. Because then you can
do stuff like:

user1@localhost:~/data/scripts$ sudo mount --bind -o X-mount.idmap='g:0:1000:1 u:0:1234:1' /run/ /mnt
user1@localhost:~/data/scripts$ ls -ln /run/
total 12
drwxr-xr-x 2 0 0 40 Jan 5 12:12 credentials
drwx------ 2 0 0 40 Jan 5 11:57 cryptsetup
drwxr-xr-x 2 0 0 60 Jan 5 11:57 dbus
drwx------ 6 0 0 280 Jan 5 11:57 incus_agent
prw------- 1 0 0 0 Jan 5 11:57 initctl
drwxrwxrwt 4 0 0 80 Jan 5 11:57 lock
drwxr-xr-x 3 0 0 60 Jan 5 11:57 log
drwx------ 2 0 0 40 Jan 5 11:57 lvm
-r--r--r-- 1 0 0 33 Jan 5 11:57 machine-id
-rw-r--r-- 1 0 0 101 Jan 5 11:58 motd.dynamic
drwxr-xr-x 2 0 0 40 Jan 5 11:57 mount
drwx------ 2 0 0 40 Jan 5 11:57 multipath
drwxr-xr-x 2 0 0 40 Jan 5 11:57 sendsigs.omit.d
lrwxrwxrwx 1 0 0 8 Jan 5 11:57 shm -> /dev/shm
drwx--x--x 2 0 0 40 Jan 5 11:57 sudo
drwxr-xr-x 24 0 0 660 Jan 5 14:30 systemd
drwxr-xr-x 6 0 0 140 Jan 5 14:30 udev
drwxr-xr-x 4 0 0 80 Jan 5 11:58 user
-rw-rw-r-- 1 0 43 2304 Jan 5 15:15 utmp

user1@localhost:~/data/scripts$ ls -ln /mnt/
total 12
drwxr-xr-x 2 1234 1000 40 Jan 5 12:12 credentials
drwx------ 2 1234 1000 40 Jan 5 11:57 cryptsetup
drwxr-xr-x 2 1234 1000 60 Jan 5 11:57 dbus
drwxr-xr-x 2 1234 1000 40 Jan 5 11:57 incus_agent
prw------- 1 1234 1000 0 Jan 5 11:57 initctl
drwxr-xr-x 2 1234 1000 40 Jan 5 11:57 lock
drwxr-xr-x 3 1234 1000 60 Jan 5 11:57 log
drwx------ 2 1234 1000 40 Jan 5 11:57 lvm
-r--r--r-- 1 1234 1000 33 Jan 5 11:57 machine-id
-rw-r--r-- 1 1234 1000 101 Jan 5 11:58 motd.dynamic
drwxr-xr-x 2 1234 1000 40 Jan 5 11:57 mount
drwx------ 2 1234 1000 40 Jan 5 11:57 multipath
drwxr-xr-x 2 1234 1000 40 Jan 5 11:57 sendsigs.omit.d
lrwxrwxrwx 1 1234 1000 8 Jan 5 11:57 shm -> /dev/shm
drwx--x--x 2 1234 1000 40 Jan 5 11:57 sudo
drwxr-xr-x 24 1234 1000 660 Jan 5 14:30 systemd
drwxr-xr-x 6 1234 1000 140 Jan 5 14:30 udev
drwxr-xr-x 4 1234 1000 80 Jan 5 11:58 user
-rw-rw-r-- 1 1234 65534 2304 Jan 5 15:15 utmp

Where you can see that ownership of this tmpfs instance in this example
is changed. I'm not trying to advocate here but this will probably
ultimately be nicer for your users because it means that a container
manager or whatever can be handed a part of tracefs (or all of it) and
the ownership and access rights for that thing are correct. And you can
get rid of that gid based access completely.

You can change uids, gids, or both. You can specify up to 340 individual
mappings, so it's quite flexible.
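
To make the mapping format concrete, here is a rough sketch of how a
type:from:to:range entry, like the ones in the mount command above,
translates an id on the filesystem into the id shown on the mount. This
is only an illustration, not what mount(8) or the kernel actually runs:
the map_id helper and the from/to/range field names are made up, it only
handles the u and g types, and it assumes the overflow id is the default
65534. That fallback is what you see for utmp above, because gid 43
isn't covered by any mapping.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only:
#   map_id TYPE ID "MAPPINGS"
# TYPE is u or g. Each mapping is type:from:to:range, as in the
# X-mount.idmap example above. Prints the id as it would appear on
# the idmapped mount.

OVERFLOW_ID=65534   # assumption: default overflowuid/overflowgid

map_id() {
    want=$1
    id=$2
    for m in $3; do
        # Split "type:from:to:range" with parameter expansion.
        mtype=${m%%:*}; rest=${m#*:}
        from=${rest%%:*}; rest=${rest#*:}
        to=${rest%%:*}; range=${rest#*:}
        # An id is covered if it lies in [from, from + range).
        if [ "$mtype" = "$want" ] &&
           [ "$id" -ge "$from" ] &&
           [ "$id" -lt $((from + range)) ]; then
            echo $((to + id - from))
            return 0
        fi
    done
    # No mapping covers this id, so fall back to the overflow id.
    echo "$OVERFLOW_ID"
}

idmap='g:0:1000:1 u:0:1234:1'
map_id u 0  "$idmap"   # prints 1234
map_id g 0  "$idmap"   # prints 1000
map_id g 43 "$idmap"   # prints 65534
```

That matches the two listings: uid 0 shows up as 1234, gid 0 as 1000,
and the unmapped gid 43 as 65534.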

With idmapped mounts you can have a single tracefs superblock and have multiple
mounts with different ownership for the relevant parts of tracefs that
you want to delegate to whoever. If you need an ownership change you can
then just create another idmapped mount with the new ownership and then
use MOVE_MOUNT_BENEATH + umount to replace that mount.

I probably even know someone who would implement this for you (not me)
if that sounds like something that would cover some of the use-cases for
the proposed change here. But maybe I just misunderstood things completely.