Re: Is ->sendmsg() allowed to change the msghdr struct it is given?

From: Paolo Abeni
Date: Tue Jun 27 2023 - 09:52:40 EST


On Tue, 2023-06-27 at 14:09 +0100, David Howells wrote:
> Paolo Abeni <pabeni@xxxxxxxxxx> wrote:
>
> > udp_sendmsg() can set the MSG_TRUNC bit in msg->msg_flags, so I guess
> > that kind of actions are sort of allowed. Still, AFAICS, the kernel
> > based msghdr is not copied back to the user-space, so such change
> > should be almost a no-op in practice.
> >
> > @David: which would be the end goal for such action?
>
> Various places in the kernel use sock_sendmsg() - and I've added a bunch more
> with the MSG_SPLICE_PAGES patches. For some of the things I've added, there's
> a loop which used to call ->sendpage() and now calls sock_sendmsg(). In most
> of those places, msghdr will get reset each time round the loop - but not in
> all cases.
>
> Of particular immediate interest is net/ceph/messenger_v2.c. If you go to:
>
> https://lore.kernel.org/r/3111635.1687813501@xxxxxxxxxxxxxxxxxxxxxx/
>
> and look at the resultant code:
>
> static int do_sendmsg(struct socket *sock, struct iov_iter *it)
> {
> struct msghdr msg = { .msg_flags = CEPH_MSG_FLAGS };
> int ret;
>
> msg.msg_iter = *it;
> while (iov_iter_count(it)) {
> ret = sock_sendmsg(sock, &msg);
> if (ret <= 0) {
> if (ret == -EAGAIN)
> ret = 0;
> return ret;
> }
>
> iov_iter_advance(it, ret);
> }
>
> WARN_ON(msg_data_left(&msg));
> return 1;
> }
>
> for example. It could/would malfunction if sendmsg() is allowed to modify
> msghdr - or if it doesn't update msg_iter. Likewise:
>
> static int do_try_sendpage(struct socket *sock, struct iov_iter *it)
> {
> struct msghdr msg = { .msg_flags = CEPH_MSG_FLAGS };
> struct bio_vec bv;
> int ret;
>
> if (WARN_ON(!iov_iter_is_bvec(it)))
> return -EINVAL;
>
> while (iov_iter_count(it)) {
> /* iov_iter_iovec() for ITER_BVEC */
> bvec_set_page(&bv, it->bvec->bv_page,
> min(iov_iter_count(it),
> it->bvec->bv_len - it->iov_offset),
> it->bvec->bv_offset + it->iov_offset);
>
> /*
> * MSG_SPLICE_PAGES cannot properly handle pages with
> * page_count == 0, we need to fall back to sendmsg if
> * that's the case.
> *
> * Same goes for slab pages: skb_can_coalesce() allows
> * coalescing neighboring slab objects into a single frag
> * which triggers one of hardened usercopy checks.
> */
> if (sendpage_ok(bv.bv_page))
> msg.msg_flags |= MSG_SPLICE_PAGES;
> else
> msg.msg_flags &= ~MSG_SPLICE_PAGES;
>
> iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bv, 1, bv.bv_len);
> ret = sock_sendmsg(sock, &msg);
> if (ret <= 0) {
> if (ret == -EAGAIN)
> ret = 0;
> return ret;
> }
>
> iov_iter_advance(it, ret);
> }
>
> return 1;
> }
>
> could be similarly affected if ->sendmsg() mucks about with msg_flags.

With some help from the compiler - locally changing the proto_ops and
proto sendmsg definition and handling the fallout - I found the
following:

- mptcp_sendmsg() is clearing unsupported msg_flags
(I should have recalled this one even without much testing ;)

- udpv4_sendmsg() is setting msg_name/msg_namelen
- tls_device_sendmsg() is clearing MSG_SPLICE_PAGE when zerocopy is not
supported
- unix_seqpacket_sendmsg() is clearing msg_namelen

I could have missed something, but the above looks safe for the use-
case you mentioned.

Cheers,

Paolo