Re: [PATCH net] ipv4, ipv6: Fix handling of transhdrlen in __ip{,6}_append_data()

From: Willem de Bruijn
Date: Tue Sep 19 2023 - 16:34:35 EST


On Tue, Sep 19, 2023 at 12:12 PM David Howells <dhowells@xxxxxxxxxx> wrote:
>
>
> Including the transhdrlen in length is a problem when the packet is
> partially filled (e.g. something like send(MSG_MORE) happened previously)
> when appending to an IPv4 or IPv6 packet as we don't want to repeat the
> transport header or account for it twice. This can happen under some
> circumstances, such as splicing into an L2TP socket.
>
> The symptom observed is a warning in __ip6_append_data():
>
> WARNING: CPU: 1 PID: 5042 at net/ipv6/ip6_output.c:1800 __ip6_append_data.isra.0+0x1be8/0x47f0 net/ipv6/ip6_output.c:1800
>
> that occurs when MSG_SPLICE_PAGES is used to append more data to an already
> partially occupied skbuff. The warning occurs when 'copy' is larger than
> the amount of data in the message iterator. This is because the requested
> length includes the transport header length when it shouldn't. This can be
> triggered by, for example:
>
> sfd = socket(AF_INET6, SOCK_DGRAM, IPPROTO_L2TP);
> bind(sfd, ...); // ::1
> connect(sfd, ...); // ::1 port 7
> send(sfd, buffer, 4100, MSG_MORE);
> sendfile(sfd, dfd, NULL, 1024);
>
> Fix this by pushing the addition of transhdrlen into length down into
> __ip_append_data() and __ip6_append_data(), making it conditional on the
> write queue being empty (otherwise we just clear transhdrlen).
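
To make the proposed rule concrete, here is a minimal userspace sketch
of the accounting as described above. It is a model, not the kernel
code: struct append_args, effective_length() and the queue_empty flag
are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the arguments to an append call (not kernel code). */
struct append_args {
	size_t length;      /* bytes to append, transport header NOT included */
	size_t transhdrlen; /* transport header size passed by the caller */
};

/*
 * Model of the rule in the patch description: add the transport header
 * to the length only when the write queue is empty, i.e. when this call
 * starts a new packet. Otherwise clear transhdrlen so that the header
 * is neither repeated nor counted twice.
 */
static size_t effective_length(struct append_args *a, bool queue_empty)
{
	assert(a != NULL);

	if (queue_empty)
		return a->length + a->transhdrlen;

	a->transhdrlen = 0; /* header was already queued by an earlier call */
	return a->length;
}
```

With the reproducer's sizes, the first send() (empty queue) is
accounted as 4100 plus the header, while the following sendfile()
(queue not empty) is accounted as 1024 only.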

I'm afraid that we might start to dig an ever-deepening hole.

The proposed fix is non-trivial, and changes not just the new path
that observes the issue (MSG_SPLICE_PAGES), but also the other more
common paths that exercise __ip6_append_data.

There is a significant risk of introducing an unintended side effect
that requires a follow-up fix, because this function is notoriously
complex and multiplexes a lot of behavior: with and without transport
headers, edge cases like fragmentation, MSG_MORE, absence of
scatter-gather, ...

Does the issue only affect MSG_SPLICE_PAGES, or can it affect other
paths too? If the former, is it possible to create a more targeted fix
that can trivially be seen not to affect code that predates the
introduction of splice pages?