[PATCH v3 0/2] iov_iter: Convert the iterator macros into inline funcs

From: David Howells
Date: Wed Aug 16 2023 - 08:09:37 EST


Hi Al, Linus,

Here are a couple of patches to try and clean up the iov_iter iteration
stuff.

The first patch converts the iov_iter iteration macros to always-inline
functions to make the code easier to follow. It uses function pointers,
but they should get optimised away. The priv2 argument should likewise get
optimised away if unused.

The second patch makes _copy_from_iter() and copy_page_from_iter_atomic()
handle the ->copy_mc flag earlier and not in the step function. This flag
is only set by the coredump code and only with a BVEC iterator, so we can
have special out-of-line handling for this that uses iterate_bvec() rather
than iterate_and_advance() - thereby avoiding repeated checks on it in a
multi-element iterator.

Further changes I could make:

 (1) Add an 'ITER_OTHER' type and an ops table pointer and have
     iterate_and_advance2(), iov_iter_advance(), iov_iter_revert(),
     etc. jump through it if they see the ITER_OTHER type.  This would
     allow types for, say, scatterlist, bio list and skbuff to be added
     without further expanding the core.

 (2) Move the ITER_XARRAY type to being an ITER_OTHER type.  This would
     shrink the core iterators quite a lot and reduce the stack usage as
     the xarray-walking code wouldn't be there.

 (3) Move the iterate_*() functions into a header file so that bespoke
     iterators can be created elsewhere.  For instance, rbd has an
     optimisation that requires it to scan the buffer it is given to see
     if it is all zeros.  It would be nice if this could use
     iterate_and_advance() - but that's buried inside lib/iov_iter.c.

Anyway, the overall changes in compiled function size for these patches on
x86_64 look like:

	__copy_from_iter_mc                new  0xd6
	__export_symbol_iov_iter_init      inc  0x3   -> 0x8   +0x5
	_copy_from_iter                    inc  0x36e -> 0x380 +0x12
	_copy_from_iter_flushcache         inc  0x359 -> 0x364 +0xb
	_copy_from_iter_nocache            dcr  0x36a -> 0x33e -0x2c
	_copy_mc_to_iter                   inc  0x3a7 -> 0x3bc +0x15
	_copy_to_iter                      dcr  0x358 -> 0x34a -0xe
	copy_page_from_iter_atomic.part.0  inc  0x3cf -> 0x3d4 +0x5
	copy_page_to_iter_nofault.part.0   dcr  0x3f1 -> 0x3a9 -0x48
	copyin                             del  0x30
	copyout                            del  0x2d
	copyout_mc                         del  0x2b
	csum_and_copy_from_iter            dcr  0x3e8 -> 0x3e5 -0x3
	csum_and_copy_to_iter              dcr  0x46a -> 0x446 -0x24
	iov_iter_zero                      dcr  0x34f -> 0x338 -0x17
	memcpy_from_iter.isra.0            del  0x1f

with __copy_from_iter_mc() being the out-of-line handling for ->copy_mc.

I've pushed the patches here also:

https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=iov-cleanup

David

Changes
=======
ver #3)
 - Use min_t(size_t,) not min() to avoid a warning on Hexagon.
 - Inline all the step functions.
 - Added a patch to better handle copy_mc.

ver #2)
 - Rebased on top of Willy's changes in linux-next.
 - Change the checksum argument to the iteration functions to be a
   general void* and use it to pass the iter->copy_mc flag to
   memcpy_from_iter_mc() to avoid using a function pointer.
 - Arrange the end of the iterate_*() functions to look the same to give
   the optimiser the best chance.
 - Make iterate_and_advance() a wrapper around iterate_and_advance2().
 - Adjust iterate_and_advance2() to use if-else-if-else-if-else rather
   than switch(), to put ITER_BVEC before KVEC and to mark UBUF and
   IOVEC as likely().
 - Move "iter->count += progress" into iterate_and_advance2() from the
   iterate functions.
 - Mark a number of the iterator helpers with __always_inline.
 - Fix _copy_from_iter_flushcache() to use memcpy_from_iter_flushcache(),
   not memcpy_from_iter().

Link: https://lore.kernel.org/r/3710261.1691764329@xxxxxxxxxxxxxxxxxxxxxx/ # v1
Link: https://lore.kernel.org/r/855.1692047347@xxxxxxxxxxxxxxxxxxxxxx/ # v2

David Howells (2):
iov_iter: Convert iterate*() to inline funcs
iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()

lib/iov_iter.c | 627 ++++++++++++++++++++++++++++++-------------------
1 file changed, 386 insertions(+), 241 deletions(-)