Re: [PATCH] eventfs: Stop using dcache_readdir() for getdents()

From: Steven Rostedt
Date: Wed Jan 03 2024 - 14:52:17 EST


On Wed, 3 Jan 2024 10:38:09 -0800
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> @@ -332,10 +255,8 @@ static int tracefs_apply_options(struct super_block *sb, bool remount)
> if (!remount || opts->opts & BIT(Opt_uid))
> inode->i_uid = opts->uid;
>
> - if (!remount || opts->opts & BIT(Opt_gid)) {
> - /* Set all the group ids to the mount option */
> - set_gid(sb->s_root, opts->gid);
> - }
> + if (!remount || opts->opts & BIT(Opt_gid))
> + inode->i_gid = opts->gid;
>
> return 0;
> }

This doesn't work because for tracefs (not eventfs) the dentries are
created at boot up and before the file system is mounted. This means you
can't even set a gid in /etc/fstab. This will cause a regression.

tracefs was designed after debugfs, which also ignores gid. But because
there are users out there who want non-root accounts to have access to
tracing, the documented approach is to set the gid to a group that you can
then add users to. And that's the reason behind the set_gid() walk.
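
To illustrate what such a walk amounts to, here is a rough userspace model.
It is not the kernel's set_gid(); struct demo_node and demo_set_gid() are
made-up names, and the real code walks dentries under the proper locks.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for a dentry/inode pair; not a kernel structure. */
struct demo_node {
	unsigned int gid;
	struct demo_node *child;	/* first entry below, NULL for a file */
	struct demo_node *sibling;	/* next entry in the same directory */
};

/*
 * Set gid on node and on everything below it.
 * Returns the number of nodes that were updated.
 */
static size_t demo_set_gid(struct demo_node *node, unsigned int gid)
{
	size_t count = 1;

	assert(node != NULL);
	node->gid = gid;

	/* Recurse into each child; files have no children. */
	for (struct demo_node *c = node->child; c; c = c->sibling)
		count += demo_set_gid(c, gid);

	return count;
}
```

The point of the model is only that every already-created entry has to be
visited, since none of them consult the mount options on their own.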

Reverting that one commit won't fix things either. It only blocked OTH
from reading, but the modes passed in when the files are created were also
changed to block OTH, so all of those would need to be changed too. And I
don't think making the trace files open to OTH is a good solution, even if
the tracefs top level directory itself blocks other. The issue was that the
user used to just mount the top level to allow the group access to the files
below, which allowed all users access. But this is weak control of the file
system.

Even my non-test machines have me in the tracing group so my user account
has access to tracefs.

On boot up, all the tracefs files are created via tracefs_create_file() and
the directories via tracefs_create_dir(), which were copied from
debugfs_create_file/dir(). At that moment, the dentry is created with the
permissions set. There's no looking at the super block.

So we need a way to change the permissions at mount time.

The only solution I can think of that doesn't include walking the current
dentries is to convert all of tracefs to be more like eventfs, and have
the dentries created on demand. But perhaps, unlike eventfs, they do not
need to be freed when they are no longer referenced, which should make it
easier to implement. And there aren't nearly as many files and
directories, so keeping metadata around isn't as much of an issue.

Instead of creating the inode and dentry in the tracefs_create_file/dir(),
it could just create a descriptor that holds the fops, data and mode. Then
on lookup, it would create the inodes and dentries similar to eventfs.
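
As a rough sketch of that idea, in userspace rather than kernel code: every
name here (struct demo_desc, struct demo_inode, struct demo_mount_opts,
demo_lookup()) is made up for illustration, and the real thing would deal
with dentries, locking and the VFS lookup callback.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-ins; none of these are kernel types. */
struct demo_inode {
	unsigned int mode;
	unsigned int gid;
	const void *fops;
	void *data;
};

struct demo_mount_opts {
	unsigned int gid;	/* taken from the gid= mount option */
};

/* What a create call would record instead of building an inode. */
struct demo_desc {
	const char *name;
	unsigned int mode;
	const void *fops;
	void *data;
	struct demo_inode *inode;	/* NULL until the first lookup */
};

/*
 * Find name in the table. The inode is built on first use, so it picks
 * up the mount options in effect at that time. It is then kept around
 * and not freed when unreferenced.
 */
static struct demo_inode *demo_lookup(struct demo_desc *tbl, size_t n,
				      const char *name,
				      const struct demo_mount_opts *opts)
{
	assert(tbl && name && opts);

	for (size_t i = 0; i < n; i++) {
		struct demo_desc *d = &tbl[i];

		if (strcmp(d->name, name) != 0)
			continue;

		if (!d->inode) {
			d->inode = malloc(sizeof(*d->inode));
			if (!d->inode)
				return NULL;
			d->inode->mode = d->mode;
			d->inode->gid = opts->gid;
			d->inode->fops = d->fops;
			d->inode->data = d->data;
		}
		return d->inode;
	}
	return NULL;
}
```

In this model the gid is read from the mount options only when the inode is
first instantiated, so a remount that changes the gid would still have to
update whatever has already been created.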

It would need its own iterate_shared as well.
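
Something along these lines, again only as a userspace model with made-up
names (struct demo_entry, struct demo_dir_ctx, demo_iterate(),
demo_emit_limited()). It mimics the usual pattern of resuming from a saved
position and stopping when the caller's buffer is full, and it lists the
directory from the descriptors alone, without creating any inode:

```c
#include <assert.h>
#include <stddef.h>

/* One recorded directory entry; stands in for a descriptor. */
struct demo_entry {
	const char *name;
};

/* Position that survives between calls, like a directory offset. */
struct demo_dir_ctx {
	size_t pos;
};

/* Returns nonzero if the name was accepted, 0 if there is no room left. */
typedef int (*demo_emit_fn)(void *arg, const char *name);

/*
 * Emit entries starting at ctx->pos and advance the position for each
 * one accepted. Returns the number of entries emitted by this call.
 */
static size_t demo_iterate(const struct demo_entry *tbl, size_t n,
			   struct demo_dir_ctx *ctx,
			   demo_emit_fn emit, void *arg)
{
	size_t emitted = 0;

	assert(tbl && ctx && emit);

	while (ctx->pos < n) {
		if (!emit(arg, tbl[ctx->pos].name))
			break;
		ctx->pos++;
		emitted++;
	}
	return emitted;
}

/* Example callback: accept names until a caller-supplied budget runs out. */
static int demo_emit_limited(void *arg, const char *name)
{
	size_t *budget = arg;

	(void)name;
	if (*budget == 0)
		return 0;
	(*budget)--;
	return 1;
}
```

A second call with the same context picks up where the first one stopped,
which is what a getdents() loop relies on.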

-- Steve