Re: [PATCH 2/2] mm: Make swap_readpage() for SWP_FS_OPS use ->direct_IO() not ->readpage()

From: Matthew Wilcox
Date: Thu Aug 12 2021 - 11:41:44 EST


On Thu, Aug 12, 2021 at 01:57:05PM +0100, David Howells wrote:
> Christoph Hellwig <hch@xxxxxx> wrote:
>
> > On Thu, Aug 12, 2021 at 12:57:58PM +0100, David Howells wrote:
> > > Make swap_readpage(), when accessing a swap file (SWP_FS_OPS) use
> > > the ->direct_IO() method on the filesystem rather then ->readpage().
> >
> > ->direct_IO is just a helper for ->read_iter and ->write_iter, so please
> > don't call it directly. It actually is slowly on its way out, with at
> > at least all of the iomap implementations not using it, as well as various
> > other file systems.
>
> [Note that __swap_writepage() uses ->direct_IO().]
>
> Calling ->write_iter is probably a bad idea here. Imagine that it goes
> through, say, generic_file_write_iter(), then __generic_file_write_iter() and
> then generic_file_direct_write(). It adds a number of delays into the system,
> including:
>
> - Taking the inode lock
> - Removing file privs
> - Cranking mtime, ctime, file version
> - Doing mnt_want_write
> - Setting the inode dirty
> - Waiting on pages in the range that are being written
> - Walking over the pagecache to invalidate the range
> - Redoing the invalidation (can't be skipped since page 0 is pinned)
>
> that we might want to skip as they'll end up being done for every page swapped
> out.

I agree with David; we want something lower-level for swap to call into.
I'd suggest aops->swap_rw and an implementation might well look
something like:

static ssize_t ext4_swap_rw(struct kiocb *iocb, struct iov_iter *iter)
{
return iomap_dio_rw(iocb, iter, &ext4_iomap_ops, NULL, 0);
}