Upcoming: Notifications, FS notifications and fsinfo()

From: David Howells
Date: Mon Mar 30 2020 - 09:58:36 EST



Hi Linus,

I have three sets of patches I'd like to push your way, if you (and Al) are
willing to consider them.

(1) General notification queue plus key/keyring notifications.

This adds the core of the notification queue built on pipes, and adds
the ability to watch for changes to keys.

(2) Mount and superblock notifications.

This builds on (1) to provide notifications of mount topology changes
and implements a framework for superblock events (configuration
changes, I/O errors, quota/space overruns and network status changes).

(3) Filesystem information retrieval.

This provides an extensible way to retrieve informational attributes
about mount objects and filesystems. This includes providing
information intended to make recovering from a notification queue
overrun much easier.
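
As an illustration of why overrun recovery matters to a consumer, here is a minimal sketch of the usual pattern for a lossy event queue: apply events incrementally, and rebuild state from a full snapshot when an overrun is reported. The event shape, the "overrun" marker and the take_snapshot callback are all hypothetical; they are not the record format or API from the patches.

```python
def apply_events(table, events, take_snapshot):
    """Apply change events to `table`; resync from a snapshot after an overrun.

    The event dictionaries and `take_snapshot` are hypothetical stand-ins,
    not the kernel's notification format or the fsinfo() interface.
    """
    for event in events:
        if event["type"] == "overrun":
            # Events were lost, so the incremental state cannot be trusted.
            # Throw it away and rebuild from a full query of current state.
            table.clear()
            table.update(take_snapshot())
        elif event["type"] == "mount":
            table[event["id"]] = event["path"]
        elif event["type"] == "unmount":
            table.pop(event["id"], None)
    return table
```

In this sketch take_snapshot() stands in for whatever full-state query is available; the cheaper that query is, the less an overrun costs the watcher.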

We need (1) for GNOME to efficiently watch for changes in Kerberos
keyrings. Debarshi Ray has patches ready to go for gnome-online-accounts
so that it can make use of the facility.

Sets (2) and (3) can make libmount more efficient. Karel Zak is working on
making use of this to avoid reading /proc/self/mountinfo.

We need something to make systemd's watching of the mount topology more
efficient, and (2) and (3) can help with this by making it faster to narrow
down what changed. I think Karel has this in his sights, but hasn't yet
managed to work on it.

Set (2) should be able to make it easier to watch for mount options inside
a container, and set (3) should make it easier to examine the mounts inside
another mount namespace inside a container in a way that can't be done with
/proc/mounts. This is requested by Christian Brauner.

Jeff Layton has a tentative addition to (3) to expose error state to
userspace, and Andres Freund would like this for Postgres.

Set (3) further allows the information returned by calls such as statx()
and ioctl(FS_IOC_GETFLAGS) to be qualified by indicating which bits
are/aren't supported.

Further, for (3), I also allow filesystem-specific overrides/extensions to
fsinfo() and have a use for it in AFS to expose information about server
preference for a particular volume (something that is necessary for
implementing the toolset). I've provided example code that does something
similar for NFS and some that exposes superblock info from Ext4. At Vault,
Steve expressed an interest in this for CIFS and Ted Ts'o expressed a
possible interest for Ext4.

Notes:

(*) These patches will conflict with an apparently upcoming refactoring of
the security core, but the fixup doesn't look too bad:

https://lore.kernel.org/linux-next/20200330130636.0846e394@xxxxxxxxxxxxxxxx/T/#u

(*) Miklós Szeredi would much prefer to implement fsinfo() as a magic
filesystem mounted on /proc/self/fsinfo/ whereby your open fds appear
as directories under there, each with a set of attribute files
corresponding to the attributes that fsinfo() would otherwise provide.
To examine something by filename, you'd have to open it O_PATH and
then read the individual attribute files in the corresponding per-fd
directory. A readfile() system call has been mooted to elide the
{open,read,close} sequence to make it more efficient.

(*) James Bottomley would like to deprecate fsopen(), fspick(), fsconfig()
and fsmount() in favour of a more generic configfs with dedicated
open, set-config and action syscalls, with an additional get-config
syscall that would be used instead of fsinfo() - though, as I
understand it, you'd have to create a config (fspick-equivalent)
before you could use get-config.

(*) I don't think Al has particularly looked at fsinfo() or the fs
notifications patches yet.

(*) I'm not sure what *your* opinion of fsinfo() is yet. If you don't
dislike it too, um, fragrantly, would you be willing to entertain part
of it for now and prefer the rest to stew a bit longer? I can drop
some of the pieces.
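
For reference, the per-fd directory access pattern described in the fsinfo-as-filesystem note above might look roughly like this from userspace. This is only a sketch: /proc/self/fsinfo/ does not exist on any released kernel, the attribute names a caller would pass are made up, and read_attribute() and read_attributes_by_path() are hypothetical helpers. It mainly shows the three calls per attribute that a readfile() call would collapse into one.

```python
import os

# Hypothetical mount point taken from the proposal; it does not exist on
# any released kernel.
FSINFO_ROOT = "/proc/self/fsinfo"


def read_attribute(attr_dir, name, max_len=4096):
    """Read one attribute file using the explicit open/read/close sequence."""
    fd = os.open(os.path.join(attr_dir, name), os.O_RDONLY)  # 1: open
    try:
        return os.read(fd, max_len)                           # 2: read
    finally:
        os.close(fd)                                          # 3: close


def read_attributes_by_path(target, names, root=FSINFO_ROOT):
    """Open `target` with O_PATH, then read attributes from its per-fd dir."""
    path_fd = os.open(target, os.O_PATH)
    try:
        # Under the proposal, each open fd would appear as a directory
        # named after its number beneath the magic filesystem.
        attr_dir = os.path.join(root, str(path_fd))
        return {name: read_attribute(attr_dir, name) for name in names}
    finally:
        os.close(path_fd)
```

Querying N attributes of one file this way costs one O_PATH open plus 3N further calls, which is the overhead a readfile() call would be meant to reduce.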

Anyway, I'm going to formulate a pull request for each of them.

Thanks,
David