[GIT PULL] vfs super updates

From: Christian Brauner
Date: Fri Jan 05 2024 - 07:42:00 EST


Hey Linus,

/* Summary */
This contains the super work for this cycle including the long-awaited series
by Jan to make it possible to prevent writing to mounted block devices:

* Writing to mounted devices is dangerous and can lead to filesystem
corruption as well as crashes. Furthermore, syzbot keeps coming up with more
and more involved examples of how to corrupt a block device under a mounted
filesystem, leading to kernel crashes and reports we can do nothing about.
Add tracking of writers to each block device and a kernel cmdline argument
which controls whether other writeable opens are allowed on block devices
opened with the BLK_OPEN_RESTRICT_WRITES flag.

Note that this effectively only prevents modification of the particular block
device's page cache by other writers. The actual device content can still be
modified by other means - e.g. by issuing direct SCSI commands, by doing
writes through devices lower in the storage stack (e.g. in case loop devices,
DM, or MD are involved) etc. But blocking direct modifications of the block
device page cache is enough to give filesystems a chance to perform data
validation when loading data from the underlying storage and thus prevent
kernel crashes.

Syzbot can use this cmdline option to avoid uninteresting crashes. Also,
users whose userspace setup does not need to write to mounted block devices
can set this option for hardening. We expect that this will be interesting
to quite a few workloads.

Btrfs is currently opted out of this because the btrfs patches we require
for this to work, which date from three kernel releases ago, still haven't
been merged.

* Reimplement block device freezing and thawing as holder operations on the
block device.

This allows us to extend block device freezing to all devices associated with
a superblock and not just the main device. It also allows us to remove
get_active_super() and thus another function that scans the global list of
superblocks.

Freezing via additional block devices only works if the filesystem chooses to
use @fs_holder_ops for these additional devices as well. That currently only
includes ext4 and xfs.

Earlier releases switched get_tree_bdev() and mount_bdev() to use
@fs_holder_ops. The remaining nilfs2 open-coded version of mount_bdev() has
been converted to rely on @fs_holder_ops as well. So block device freezing
for the main block device will continue to work as before.

There should be no regressions in functionality. The only special case is
btrfs where block device freezing for the main block device never worked
because sb->s_bdev isn't set. Block device freezing for btrfs can be fixed
once they can switch to @fs_holder_ops but that can happen whenever they're
ready.

* Various cleanups.
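
For readers who want a feel for the open policy described in the first item,
here is a minimal userspace sketch of it. This is not the code in
block/bdev.c. The struct, the MODEL_* flags, the allow_write_mounted variable
and the model_*() helpers are all hypothetical names made up for
illustration, and locking is left out entirely. It only assumes what the
summary states: writers are tracked per device, and a switch decides whether
plain writeable opens may coexist with an opener that asked for restricted
writes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical open-mode flags for this model; not the kernel's definitions. */
#define MODEL_OPEN_WRITE           (1u << 0)
#define MODEL_OPEN_RESTRICT_WRITES (1u << 1)

/* Toy stand-in for a block device: only the state the policy needs. */
struct model_bdev {
	int writers;            /* number of plain writeable opens */
	bool writes_restricted; /* set while a restricting opener holds the device */
};

/* Models the cmdline switch: true means the old, permissive behaviour. */
static bool allow_write_mounted = true;

/* Decide whether an open with the given mode is acceptable right now. */
static bool model_may_open(const struct model_bdev *bdev, unsigned int mode)
{
	if (allow_write_mounted)
		return true;
	/* A restricting opener (e.g. a mounted filesystem) is present. */
	if ((mode & MODEL_OPEN_WRITE) && bdev->writes_restricted)
		return false;
	/* Somebody already writes, so restricted access cannot be granted. */
	if ((mode & MODEL_OPEN_RESTRICT_WRITES) && bdev->writers > 0)
		return false;
	return true;
}

/* Open the device and record the kind of access that was claimed. */
static bool model_open(struct model_bdev *bdev, unsigned int mode)
{
	if (!model_may_open(bdev, mode))
		return false;
	if (mode & MODEL_OPEN_RESTRICT_WRITES)
		bdev->writes_restricted = true;
	else if (mode & MODEL_OPEN_WRITE)
		bdev->writers++;
	return true;
}

/* Undo what model_open() recorded for the same mode. */
static void model_release(struct model_bdev *bdev, unsigned int mode)
{
	if (mode & MODEL_OPEN_RESTRICT_WRITES) {
		bdev->writes_restricted = false;
	} else if (mode & MODEL_OPEN_WRITE) {
		assert(bdev->writers > 0);
		bdev->writers--;
	}
}
```

With allow_write_mounted set to false, a plain model_open(&dev,
MODEL_OPEN_WRITE) fails while a restricting opener holds the device, and a
restricting open fails while a plain writer is still around. Read-only opens
always succeed. As noted above, this only models access through the block
device itself; it says nothing about writes that reach the storage by other
paths.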

/* Testing */
clang: Debian clang version 16.0.6 (19)
gcc: (Debian 13.2.0-7) 13.2.0

All patches are based on v6.7-rc1 and have been sitting in linux-next.
No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

[1] linux-next: manual merge of the vfs-brauner tree with Linus' tree
https://lore.kernel.org/linux-next/20231204103510.0eb5ea5f@xxxxxxxxxxxxxxxx

Merge conflicts with other trees
================================

[1] linux-next: manual merge of the vfs-brauner tree with the btrfs tree
https://lore.kernel.org/linux-next/20231127092001.54a021e8@xxxxxxxxxxxxxxxx

The needed fix is presented in:

https://lore.kernel.org/linux-next/20231128213344.GA3423530@dev-arch.thelio-3990X

[2] linux-next: manual merge of the vfs tree with the vfs-brauner tree
https://lore.kernel.org/linux-next/20231220104110.56ae9b36@xxxxxxxxxxxxxxxx

The following changes since commit b85ea95d086471afb4ad062012a4d73cd328fa86:

Linux 6.7-rc1 (2023-11-12 16:19:07 -0800)

are available in the Git repository at:

git@xxxxxxxxxxxxxxxxxxx:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.8.super

for you to fetch changes up to 8ff363ade395e72dc639810b6f59849c743c363e:

block: Fix a memory leak in bdev_open_by_dev() (2023-12-28 11:48:17 +0100)

Please consider pulling these changes from the signed vfs-6.8.super tag.

Happy New Year!
Christian

----------------------------------------------------------------
vfs-6.8.super

----------------------------------------------------------------
Christian Brauner (17):
      fs: massage locking helpers
      bdev: rename freeze and thaw helpers
      bdev: surface the error from sync_blockdev()
      bdev: add freeze and thaw holder operations
      bdev: implement freeze and thaw holder operations
      fs: remove get_active_super()
      super: remove bd_fsfreeze_sb
      fs: remove unused helper
      porting: document block device freeze and thaw changes
      blkdev: comment fs_holder_ops
      fs: simplify setup_bdev_super() calls
      xfs: simplify device handling
      ext4: simplify device handling
      fs: remove dead check
      fs: handle freezing from multiple devices
      super: massage wait event mechanism
      super: don't bother with WARN_ON_ONCE()

Christoph Hellwig (1):
      fs: streamline thaw_super_locked

Christophe JAILLET (1):
      block: Fix a memory leak in bdev_open_by_dev()

Jan Kara (8):
      nilfs2: simplify device handling
      bcachefs: Convert to bdev_open_by_path()
      block: Remove blkdev_get_by_*() functions
      block: Add config option to not allow writing to mounted devices
      btrfs: Do not restrict writes to btrfs devices
      fs: Block writes to mounted block devices
      xfs: Block writes to log device
      ext4: Block writes to journal device

 Documentation/filesystems/porting.rst |  12 +
 block/Kconfig                         |  20 ++
 block/bdev.c                          | 258 ++++++++++--------
 drivers/md/dm.c                       |   4 +-
 fs/bcachefs/fs-ioctl.c                |   4 +-
 fs/bcachefs/super-io.c                |  19 +-
 fs/bcachefs/super_types.h             |   1 +
 fs/btrfs/super.c                      |   2 +
 fs/ext4/ioctl.c                       |   4 +-
 fs/ext4/super.c                       |   8 +-
 fs/f2fs/file.c                        |   4 +-
 fs/nilfs2/super.c                     |   8 -
 fs/super.c                            | 498 +++++++++++++++++++---------------
 fs/xfs/xfs_fsops.c                    |   4 +-
 fs/xfs/xfs_super.c                    |  24 +-
 include/linux/blk_types.h             |   8 +-
 include/linux/blkdev.h                |  29 +-
 include/linux/fs.h                    |  19 +-
 18 files changed, 531 insertions(+), 395 deletions(-)