Re: [PATCH v12 00/10] iov_iter: Improve page extraction (pin or just list)

From: Jens Axboe
Date: Tue Feb 07 2023 - 13:49:40 EST


On 2/7/23 10:12 AM, David Howells wrote:
> Hi Jens, Al, Christoph,
>
> Here are patches to provide support for extracting pages from an iov_iter
> and to use this in the extraction functions in the block layer bio code.
>
> The patches make the following changes:
>
> (1) Change generic_file_splice_read() to load up an ITER_BVEC iterator
> with sufficient pages and use that rather than using an ITER_PIPE.
> This avoids a problem[2] when __iomap_dio_rw() calls iov_iter_revert()
> to shorten an iterator when it races with truncation. The reversion
> causes the pipe iterator to prematurely release the pages it was
> retaining - despite the read still being in progress. This caused
> memory corruption.
>
> (2) Remove ITER_PIPE and its paraphernalia as generic_file_splice_read()
> was the only user.
>
> (3) Add a function, iov_iter_extract_pages(), to replace
> iov_iter_get_pages*() that gets refs, pins or just lists the pages as
> appropriate to the iterator type.
>
> Add a function, iov_iter_extract_will_pin(), that will indicate from
> the iterator type how the cleanup is to be performed, returning true
> if the pages will need unpinning, false otherwise.
>
> (4) Make the bio struct carry a pair of flags to indicate the cleanup
> mode. BIO_NO_PAGE_REF is replaced with BIO_PAGE_REFFED (indicating
> FOLL_GET was used) and BIO_PAGE_PINNED (indicating FOLL_PIN was used)
> is added.
>
> BIO_PAGE_REFFED will go away, but at the moment fs/direct-io.c sets it
> and this series does not fully address that file.
>
> (5) Add a function, bio_release_page(), to release a page appropriately to
> the cleanup mode indicated by the BIO_PAGE_* flags.
>
> (6) Make the iter-to-bio code use iov_iter_extract_pages() to retain the
> pages appropriately and clean them up later.
>
> (7) Fix bio_flagged() so that it doesn't prevent a gcc optimisation.
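
For readers following the cleanup-mode description in items (3) to (5) of the
quoted cover letter, here is a minimal user-space sketch of the idea. It is not
code from the series. Every model_* and MODEL_* name is hypothetical, the
refs/pins counters stand in for real page reference and pin accounting, and the
mapping of user-backed iterators to pinning is an assumption made for the
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; none of these are kernel definitions. */
enum model_iter_kind {
	MODEL_ITER_USER,	/* assumed: user-backed memory, extraction pins pages */
	MODEL_ITER_KERNEL,	/* assumed: kernel-backed memory, pages are only listed */
};

#define MODEL_BIO_PAGE_REFFED	(1u << 0)	/* models "FOLL_GET was used" */
#define MODEL_BIO_PAGE_PINNED	(1u << 1)	/* models "FOLL_PIN was used" */

struct model_page { int refs; int pins; };
struct model_bio { unsigned int flags; };

/* Models the "will pin" query: true if cleanup must unpin the pages. */
static bool model_extract_will_pin(enum model_iter_kind kind)
{
	return kind == MODEL_ITER_USER;
}

/* Record the cleanup mode on the bio at extraction time. */
static void model_bio_set_cleanup(struct model_bio *bio, enum model_iter_kind kind)
{
	if (model_extract_will_pin(kind))
		bio->flags |= MODEL_BIO_PAGE_PINNED;
}

/* Release one page according to the flags recorded on the bio. */
static void model_bio_release_page(const struct model_bio *bio, struct model_page *page)
{
	if (bio->flags & MODEL_BIO_PAGE_PINNED) {
		assert(page->pins > 0);
		page->pins--;
	}
	if (bio->flags & MODEL_BIO_PAGE_REFFED) {
		assert(page->refs > 0);
		page->refs--;
	}
	/* Neither flag set: the page was only listed, so nothing is dropped. */
}
```

The point of the sketch is that the release path never has to inspect the
iterator again: the flag set when the pages were extracted is enough to pick the
matching cleanup.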

I've updated the for-6.3/iov-extract branch and the for-next branch. This
isn't done to bypass any review, just so we can get some more testing on
this (and because the old one is known broken).

--
Jens Axboe