Re: [PATCH v3] fs/buffer.c: update per-CPU bh_lru cache via RCU

From: Matthew Wilcox
Date: Thu Feb 02 2023 - 17:51:41 EST


On Fri, Feb 03, 2023 at 09:36:53AM +1100, Dave Chinner wrote:
> On Wed, Feb 01, 2023 at 01:01:47PM -0300, Marcelo Tosatti wrote:
> >
> > umount calls invalidate_bh_lrus which IPIs each
>
> via invalidate_bdev(). So this is only triggered on unmount of
> filesystems that use the block device mapping directly, right?
>
> Or is the problem that userspace is polling the block device (e.g.
> udisks, blkid, etc) whilst the filesystem is mounted and populating
> the block device mapping with cached pages so invalidate_bdev()
> always does work even when the filesystem doesn't actually use the
> bdev mapping?
>
> > CPU that has a non-empty per-CPU buffer_head cache:
> >
> > on_each_cpu_cond(has_bh_in_lru, invalidate_bh_lru, NULL, 1);
> >
> > This interrupts CPUs which might be executing code sensitive
> > to interference.
> >
> > To avoid the IPI, free the per-CPU caches remotely via RCU.
> > Two bh_lrus structures for each CPU are allocated: one is being
> > used (assigned to per-CPU bh_lru pointer), and the other is
> > being freed (or idle).
>
> Rather than adding more complexity to the legacy bufferhead code,
> wouldn't it be better to switch the block device mapping to use
> iomap+folios and get rid of the use of bufferheads altogether?

Pretty sure ext4's journalling relies on the blockdev using
buffer_heads. At least, I did a conversion of blockdev to use
mpage_readahead() and ext4 stopped working.
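
For reference, the two-structure scheme described in the quoted patch text
(one bh_lrus in use, one idle or being freed) can be sketched in userspace C.
This is only an illustration under stated assumptions, not the code from the
patch: every sim_ name, the stub buffer_head type, the slot count and the CPU
count are hypothetical, and the no-op sim_wait_for_readers() merely marks the
point where the kernel would have to wait for an RCU grace period.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SIM_LRU_SIZE 16  /* slots per LRU; value chosen for illustration */
#define SIM_NR_CPUS  4   /* hypothetical CPU count for this sketch */

/* Stand-in for struct buffer_head; only the reference count matters here. */
struct buffer_head_stub { int refcount; };

struct bh_lru_sim {
	struct buffer_head_stub *bhs[SIM_LRU_SIZE];
};

/* Two structures per CPU: one published via 'active', the other idle. */
struct cpu_lru_sim {
	struct bh_lru_sim slots[2];
	_Atomic(struct bh_lru_sim *) active;
};

static struct cpu_lru_sim cpu_lrus[SIM_NR_CPUS];

static void sim_init(void)
{
	for (int cpu = 0; cpu < SIM_NR_CPUS; cpu++)
		atomic_store(&cpu_lrus[cpu].active, &cpu_lrus[cpu].slots[0]);
}

/* Assumption: placeholder for an RCU grace period. A no-op is only
 * acceptable because this sketch is single-threaded. */
static void sim_wait_for_readers(void)
{
}

/* Owning CPU: cache bh in the first free slot of the active structure.
 * Returns 1 if cached, 0 if the structure is full (no eviction here). */
static int sim_install(int cpu, struct buffer_head_stub *bh)
{
	assert(cpu >= 0 && cpu < SIM_NR_CPUS);
	struct bh_lru_sim *lru = atomic_load(&cpu_lrus[cpu].active);

	for (int i = 0; i < SIM_LRU_SIZE; i++) {
		if (!lru->bhs[i]) {
			bh->refcount++;
			lru->bhs[i] = bh;
			return 1;
		}
	}
	return 0;
}

/* Remote invalidation without interrupting the target CPU: publish the
 * idle structure, wait for readers of the old one, then drop the
 * references it held. Returns the number of entries released. */
static int sim_invalidate_remote(int cpu)
{
	assert(cpu >= 0 && cpu < SIM_NR_CPUS);
	struct cpu_lru_sim *c = &cpu_lrus[cpu];
	struct bh_lru_sim *old = atomic_load(&c->active);
	struct bh_lru_sim *idle =
		(old == &c->slots[0]) ? &c->slots[1] : &c->slots[0];
	int dropped = 0;

	atomic_store(&c->active, idle);
	sim_wait_for_readers();

	for (int i = 0; i < SIM_LRU_SIZE; i++) {
		if (old->bhs[i]) {
			old->bhs[i]->refcount--;
			old->bhs[i] = NULL;
			dropped++;
		}
	}
	return dropped;
}
```

The point of the swap is that the invalidating CPU only ever touches the
structure that is no longer published, which is what would let it skip the
IPI. None of this changes the ext4 dependency on buffer_heads noted above.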