Re: [PATCH] tracefs/eventfs: Use root and instance inodes as default ownership

From: Matthew Wilcox
Date: Thu Jan 04 2024 - 14:03:45 EST


On Thu, Jan 04, 2024 at 10:05:44AM -0500, Steven Rostedt wrote:
> > file_system_type: what filesystem instances belong to. Not quite the same
> > thing as fs driver (one driver can provide several of those). Usually
> > it's 1-to-1, but that's not required (e.g. NFS vs NFSv4, or ext[234], or...).
>
> I don't know the difference between NFS and NFSv4 as I just used whatever
> was the latest. But I understand the ext[234] part.

What Al's sying is that nfs.ko provides both nfs_fs_type and
nfs4_fs_type. ext4.ko provides ext2_fs_type, ext3_fs_type and
ext4_fs_type. This is allowed but anomalous. Most filesystems provide
only one, eg ocfs2_fs_type.

> >
> > super_block: individual filesystem instance. Hosts dentry tree (connected or
> > several disconnected parts - think NFSv4 or the state while trying to get
> > a dentry by fhandle, etc.).
>
> I don't know how NFSv4 works, I'm only a user of it, I never actually
> looked at the code. So that's not the best example, at least for me.

Right, so NFS (v4 or otherwise) is Special. In the protocol, files
are identified by a thing called an fhandle. This is (iirc) a 32-byte
identifier which must persist across server reboot. Originally it was
probably supposed to encode dev_t plus ino_t plus generation number.
But you can do all kinds of things in the NFS protocol with an fhandle
that you need a dentry for in Linux (like path walks). Unfortunately,
clients can't be told "Hey, we've lost context, please rewalk" (which
would have other problems anyway), so we need a way to find the dentry
for an fhandle. I understand this very badly, but essentially we end
up looking for canonical ones, and then creating isolated trees of
dentries if we can't find them. Sometimes we then graft these isolated
trees into the canonical spots if we end up connecting them through
various filesystem activity.

At least that's my understanding which probably contains several
misunderstandings.

> > Filesystem object contents belongs here; multiple hardlinks
> > have different dentries and the same inode.
>
> So, can I assume that an inode could only have as many dentries as hard
> links? I know directories are only allowed to have a single hard link. Is
> that why they can only have a single dentry?

There could be more. For example, I could open("A"); ln("A", "B");
open("B"); rm("A"); ln("B", "C"); open("C"); rm("B").

Now there are three dentries for this inode, its link count is currently
one and never exceeded two.

> Thanks for this overview. It was very useful, and something I think we
> should add to kernel doc. I did read Documentation/filesystems/vfs.rst but
> honestly, I think your writeup here is a better overview.

Documentation/filesystems/locking.rst is often a better source, although
the two should really be merged. Not for the faint-hearted.