Re: [PATCH net-next v2] libceph: Partially revert changes to support MSG_SPLICE_PAGES

From: Ilya Dryomov
Date: Tue Jun 27 2023 - 09:26:24 EST


On Mon, Jun 26, 2023 at 11:05 PM David Howells <dhowells@xxxxxxxxxx> wrote:
>
> Fix the mishandling of MSG_DONTWAIT and also reinstate the per-page
> checking of the source pages (which might have come from a DIO write by
> userspace) by partially reverting the changes to support MSG_SPLICE_PAGES
> and doing things a little differently. In messenger_v1:
>
> (1) The ceph_tcp_sendpage() is resurrected and the callers reverted to use
> that.
>
> (2) The callers now pass MSG_MORE unconditionally. Previously, they were
> passing in MSG_MORE|MSG_SENDPAGE_NOTLAST and then degrading that to
> just MSG_MORE on the last call to ->sendpage().
>
> (3) Make ceph_tcp_sendpage() a wrapper around sendmsg() rather than
> sendpage(), setting MSG_SPLICE_PAGES if sendpage_ok() returns true on
> the page.
>
> In messenger_v2:
>
> (4) Bring back do_try_sendpage() and make the callers use that.
>
> (5) Make do_try_sendpage() use sendmsg() for both cases and set
> MSG_SPLICE_PAGES if sendpage_ok() returns true on the page.
>
> Fixes: 40a8c17aa770 ("ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage")
> Fixes: fa094ccae1e7 ("ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()")
> Reported-by: Ilya Dryomov <idryomov@xxxxxxxxx>
> Link: https://lore.kernel.org/r/CAOi1vP9vjLfk3W+AJFeexC93jqPaPUn2dD_4NrzxwoZTbYfOnw@xxxxxxxxxxxxxx/
> Link: https://lore.kernel.org/r/CAOi1vP_Bn918j24S94MuGyn+Gxk212btw7yWeDrRcW1U8pc_BA@xxxxxxxxxxxxxx/
> Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
> cc: Ilya Dryomov <idryomov@xxxxxxxxx>
> cc: Xiubo Li <xiubli@xxxxxxxxxx>
> cc: Jeff Layton <jlayton@xxxxxxxxxx>
> cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
> cc: Eric Dumazet <edumazet@xxxxxxxxxx>
> cc: Jakub Kicinski <kuba@xxxxxxxxxx>
> cc: Paolo Abeni <pabeni@xxxxxxxxxx>
> cc: Jens Axboe <axboe@xxxxxxxxx>
> cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> cc: ceph-devel@xxxxxxxxxxxxxxx
> cc: netdev@xxxxxxxxxxxxxxx
> Link: https://lore.kernel.org/r/3101881.1687801973@xxxxxxxxxxxxxxxxxxxxxx/ # v1
> ---
> Notes:
> ver #2)
> - Removed mention of MSG_SENDPAGE_NOTLAST in comments.
> - Changed some refs to sendpage to MSG_SPLICE_PAGES in comments.
> - Init msg_iter in ceph_tcp_sendpage().
> - Move setting of MSG_SPLICE_PAGES in do_try_sendpage() next to comment
> and adjust how it is cleared.
>
> net/ceph/messenger_v1.c | 58 ++++++++++++++++++++-----------
> net/ceph/messenger_v2.c | 88 ++++++++++++++++++++++++++++++++++++++----------
> 2 files changed, 107 insertions(+), 39 deletions(-)
>
> diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c
> index 814579f27f04..51a6f28aa798 100644
> --- a/net/ceph/messenger_v1.c
> +++ b/net/ceph/messenger_v1.c
> @@ -74,6 +74,39 @@ static int ceph_tcp_sendmsg(struct socket *sock, struct kvec *iov,
> return r;
> }
>
> +/*
> + * @more: MSG_MORE or 0.
> + */
> +static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
> + int offset, size_t size, int more)
> +{
> + struct msghdr msg = {
> + .msg_flags = MSG_DONTWAIT | MSG_NOSIGNAL | more,
> + };
> + struct bio_vec bvec;
> + int ret;
> +
> + /*
> + * MSG_SPLICE_PAGES cannot properly handle pages with page_count == 0,
> + * we need to fall back to sendmsg if that's the case.
> + *
> + * Same goes for slab pages: skb_can_coalesce() allows
> + * coalescing neighboring slab objects into a single frag which
> + * triggers one of hardened usercopy checks.
> + */
> + if (sendpage_ok(page))
> + msg.msg_flags |= MSG_SPLICE_PAGES;
> +
> + bvec_set_page(&bvec, page, size, offset);
> + iov_iter_bvec(&msg.msg_iter, ITER_DEST, &bvec, 1, size);

Hi David,

Shouldn't this be ITER_SOURCE?
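
To spell out the concern: sendmsg() reads the data to transmit out of
msg_iter, so the iterator is a data source (ITER_SOURCE), whereas
ITER_DEST marks an iterator that data is copied into, as on receive.
As far as I understand it, a copy out of an iterator that is not marked
as a source is refused, so the non-splice fallback would send nothing.
Below is a small userspace model of that check, not kernel code: every
mock_* / MOCK_* name is hypothetical and only mimics the idea.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace model of an iterator direction check.
 * None of these names are kernel APIs; they only illustrate the idea.
 */
enum mock_dir {
	MOCK_ITER_DEST = 0,	/* data is copied INTO the iterator (receive) */
	MOCK_ITER_SOURCE = 1,	/* data is copied OUT OF the iterator (send) */
};

struct mock_iter {
	enum mock_dir dir;	/* permitted direction of data flow */
	char *buf;		/* backing memory */
	size_t count;		/* bytes remaining */
};

static void mock_iter_init(struct mock_iter *it, enum mock_dir dir,
			   char *buf, size_t count)
{
	it->dir = dir;
	it->buf = buf;
	it->count = count;
}

/*
 * Send-side copy: pull up to @len bytes out of the iterator.
 * Returns the number of bytes copied, or 0 if the iterator was not
 * set up as a data source.
 */
static size_t mock_copy_from_iter(void *dst, size_t len, struct mock_iter *it)
{
	if (it->dir != MOCK_ITER_SOURCE)
		return 0;
	if (len > it->count)
		len = it->count;
	memcpy(dst, it->buf, len);
	it->buf += len;
	it->count -= len;
	return len;
}
```

In this model an iterator initialised with MOCK_ITER_DEST yields no
bytes on the send side, while the same buffer initialised with
MOCK_ITER_SOURCE is copied out in full.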

Thanks,

Ilya