[PATCH net-next v5 05/16] rds: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage

From: David Howells
Date: Fri Jun 23 2023 - 18:57:10 EST


When transmitting data, call down into TCP using sendmsg with
MSG_SPLICE_PAGES, one page at a time, rather than using sendpage.  The
flag indicates that the content should be spliced rather than copied.

To make this work, each page is described by a bio_vec that is attached to
a BVEC-type iterator in the msghdr passed to sock_sendmsg().

Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
cc: Santosh Shilimkar <santosh.shilimkar@xxxxxxxxxx>
cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
cc: Eric Dumazet <edumazet@xxxxxxxxxx>
cc: Jakub Kicinski <kuba@xxxxxxxxxx>
cc: Paolo Abeni <pabeni@xxxxxxxxxx>
cc: Jens Axboe <axboe@xxxxxxxxx>
cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
cc: linux-rdma@xxxxxxxxxxxxxxx
cc: rds-devel@xxxxxxxxxxxxxx
cc: netdev@xxxxxxxxxxxxxxx
---

Notes:
ver #4)
- Reduce change to only call sendmsg on a page at a time.
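
For reviewers, the per-page logic can be modelled in userspace roughly as
follows.  This is only a sketch: the sketch_* names and structs are
hypothetical stand-ins for the scatterlist entry and bio_vec, and the flag
values are local placeholders rather than the kernel's definitions.  It
shows that MSG_MORE is set on every segment except the last, and that a
partially sent entry resumes at offset + off for length - off bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder flag values for illustration; not taken from kernel headers. */
#define SKETCH_MSG_DONTWAIT     0x40u
#define SKETCH_MSG_NOSIGNAL     0x4000u
#define SKETCH_MSG_MORE         0x8000u
#define SKETCH_MSG_SPLICE_PAGES 0x8000000u

/* Hypothetical userspace mirror of one scatterlist entry. */
struct sketch_sg {
	unsigned int offset;	/* start of the data within the page */
	unsigned int length;	/* number of bytes in this entry */
};

/* Hypothetical mirror of the offset/length a bio_vec would carry. */
struct sketch_span {
	unsigned int offset;
	unsigned int length;
};

/*
 * Flags for segment 'sg' out of 'nents' segments: always splice without
 * blocking or raising SIGPIPE, and add MSG_MORE unless this is the last one.
 */
static unsigned int sketch_flags(unsigned int sg, unsigned int nents)
{
	unsigned int flags = SKETCH_MSG_SPLICE_PAGES | SKETCH_MSG_DONTWAIT |
			     SKETCH_MSG_NOSIGNAL;

	if (sg + 1 < nents)
		flags |= SKETCH_MSG_MORE;
	return flags;
}

/*
 * Span still to be sent from entry 'e' once 'off' bytes of it have already
 * gone out.  The caller must ensure off <= e->length.
 */
static struct sketch_span sketch_span_for(const struct sketch_sg *e,
					  unsigned int off)
{
	struct sketch_span s = {
		.offset = e->offset + off,
		.length = e->length - off,
	};

	return s;
}
```

With three segments, sketch_flags(0, 3) and sketch_flags(1, 3) carry the
MSG_MORE placeholder and sketch_flags(2, 3) does not; a single-segment
message never sets it.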

net/rds/tcp_send.c | 23 ++++++++++++-----------
1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/net/rds/tcp_send.c b/net/rds/tcp_send.c
index 8c4d1d6e9249..7d284ac7e81a 100644
--- a/net/rds/tcp_send.c
+++ b/net/rds/tcp_send.c
@@ -72,9 +72,10 @@ int rds_tcp_xmit(struct rds_connection *conn, struct rds_message *rm,
 {
 	struct rds_conn_path *cp = rm->m_inc.i_conn_path;
 	struct rds_tcp_connection *tc = cp->cp_transport_data;
+	struct msghdr msg = {};
+	struct bio_vec bvec;
 	int done = 0;
 	int ret = 0;
-	int more;

 	if (hdr_off == 0) {
 		/*
@@ -111,15 +112,17 @@ int rds_tcp_xmit(struct rds_connection *conn, struct rds_message *rm,
 			goto out;
 	}

-	more = rm->data.op_nents > 1 ? (MSG_MORE | MSG_SENDPAGE_NOTLAST) : 0;
 	while (sg < rm->data.op_nents) {
-		int flags = MSG_DONTWAIT | MSG_NOSIGNAL | more;
-
-		ret = tc->t_sock->ops->sendpage(tc->t_sock,
-						sg_page(&rm->data.op_sg[sg]),
-						rm->data.op_sg[sg].offset + off,
-						rm->data.op_sg[sg].length - off,
-						flags);
+		msg.msg_flags = MSG_SPLICE_PAGES | MSG_DONTWAIT | MSG_NOSIGNAL;
+		if (sg + 1 < rm->data.op_nents)
+			msg.msg_flags |= MSG_MORE;
+
+		bvec_set_page(&bvec, sg_page(&rm->data.op_sg[sg]),
+			      rm->data.op_sg[sg].length - off,
+			      rm->data.op_sg[sg].offset + off);
+		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1,
+			      rm->data.op_sg[sg].length - off);
+		ret = sock_sendmsg(tc->t_sock, &msg);
 		rdsdebug("tcp sendpage %p:%u:%u ret %d\n", (void *)sg_page(&rm->data.op_sg[sg]),
 			 rm->data.op_sg[sg].offset + off, rm->data.op_sg[sg].length - off,
 			 ret);
@@ -132,8 +135,6 @@ int rds_tcp_xmit(struct rds_connection *conn, struct rds_message *rm,
 			off = 0;
 			sg++;
 		}
-		if (sg == rm->data.op_nents - 1)
-			more = 0;
 	}

out: