Re: [RFC net-next v3 23/29] io_uring: allow to pass addr into sendzc

From: Pavel Begunkov
Date: Wed Jun 29 2022 - 05:57:14 EST


On 6/29/22 08:42, Stefan Metzmacher wrote:

Hi Pavel,

+    if (zc->addr) {
+        ret = move_addr_to_kernel(zc->addr, zc->addr_len, &address);
+        if (unlikely(ret < 0))
+            return ret;
+        msg.msg_name = (struct sockaddr *)&address;
+        msg.msg_namelen = zc->addr_len;
+    }
+

Given that this fills in msg almost completely, can we also have
a SENDMSGZC version? It would be very useful to also allow
msg_control to be passed, as well as an iovec.

Would that be possible?

Right, I left it to follow ups as the series is already too long.

fwiw, I'm going to also add addr to IORING_OP_SEND.


Do I understand correctly that the reason for the new opcode is
that IORING_OP_SEND would already work with the existing MSG_ZEROCOPY behavior,
together with the recvmsg-based completion?

Right, it should work with MSG_ZEROCOPY, but with different notification
semantics: it would need recvmsg from the error queue, and it comes with
performance implications.


In addition, I'm wondering whether a completion based on msg_iocb->ki_complete() (indicated by EIOCBQUEUED)
would also have worked, just deferring the whole sendmsg operation until all buffers are no longer used.
That way it would be possible to know that the buffers are acked by the remote end when it comes back to
the application layer.

There is msg_iocb, but it's mostly unused by protocols, IIRC apart
from crypto sockets. And then we'd need to repeat the path of
ubuf_info to handle stuff like skb splitting, and perhaps also
change the rules for ->ki_complete.


I'm also wondering if the ki_complete() based approach should always be provided to sock_sendmsg()
when triggered by io_uring (independent of the new zerocopy stuff). It would basically work very similarly to
the uring_cmd() completions, which are able to handle both true async operation, indicated by EIOCBQUEUED,
and the EAGAIN-triggered path via io-wq.

Would be even more similar to how we have always been doing
read/write, and rw requests do pass in a msg_iocb, but again,
it's largely ignored internally.

--
Pavel Begunkov