Re: [PATCH bpf-next] net: Fix slab-out-of-bounds in inet[6]_steal_sock

From: Martin KaFai Lau
Date: Fri Aug 11 2023 - 23:35:48 EST


On 8/9/23 10:12 AM, Martin KaFai Lau wrote:
On 8/9/23 8:55 AM, Kuniyuki Iwashima wrote:
From: Lorenz Bauer <lmb@xxxxxxxxxxxxx>
Date: Wed, 9 Aug 2023 16:08:31 +0100
On Wed, Aug 9, 2023 at 3:39 PM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote:

On 8/9/23 1:33 AM, Lorenz Bauer wrote:
Kumar reported a KASAN splat in tcp_v6_rcv:

    bash-5.2# ./test_progs -t btf_skc_cls_ingress
    ...
    [   51.810085] BUG: KASAN: slab-out-of-bounds in tcp_v6_rcv+0x2d7d/0x3440
    [   51.810458] Read of size 2 at addr ffff8881053f038c by task test_progs/226

The problem is that inet[6]_steal_sock accesses sk->sk_protocol without
accounting for request sockets. I added the check to ensure that we only
ever try to perform a reuseport lookup on a supported socket.

It turns out that this isn't necessary at all. struct sock_common contains
a skc_reuseport flag which indicates whether a socket is part of a

Does it go back to the earlier discussion
(https://lore.kernel.org/bpf/7188429a-c380-14c8-57bb-9d05d3ba4e5e@xxxxxxxxx/)
that sk->sk_reuseport is 1 from sk_clone for TCP_ESTABLISHED? It works
because there is a sk->sk_reuseport"_cb" check deeper in
reuseport_select_sock(), but there is an extra inet6_ehashfn for all TCP_ESTABLISHED.

Sigh, I'd forgotten about this...

For the TPROXY TCP replacement use case we sk_assign the SYN to the
listener, which creates the reqsk. We can let follow up packets pass
without sk_assign since they will match the reqsk and convert to a
fullsock via the usual route. At least that is what the test does. I'm
not even sure what it means to redirect a random packet into an
established TCP socket TBH. It'd probably be dropped?

It could act like an earlier early-demux for an established sk? If the bpf
prog has already looked up an established sk for other needs (e.g. reading
the sk local storage), it may as well bpf_sk_assign it to the skb. I don't
have a use case for that, but I also don't see why it wouldn't work.


For UDP, I'm not sure whether we even get into this situation? Doesn't
seem like UDP sockets are cloned from each other, so we also shouldn't
end up with a reuseport flag set erroneously.

Things we could do if necessary:
1. Reset the flag in inet_csk_clone_lock like we do for SOCK_RCU_FREE

I think we can't do this as sk_reuseport is inherited to twsk and used
in inet_bind_conflict().


2. Duplicate the cb check into inet[6]_steal_sock

or 3. Add sk_fullsock() test ?

yeah, probably adding sk_fullsock() is needed, maybe something like(?):

    if (!prefetched || !sk_fullsock(sk))
        return sk;
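
To illustrate why the sk_fullsock() test would help, here is a minimal
userspace sketch, not kernel code. The mock_* names, the struct layout and
the final "return sk->reuseport" are assumptions made for illustration; only
the state values and the shape of the sk_fullsock() bitmask test mirror the
kernel. The point is that request and timewait sockets bail out before any
field that exists only on full sockets would be read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only. Values mirror the kernel's TCP state enum. */
enum {
	MOCK_TCP_ESTABLISHED  = 1,
	MOCK_TCP_TIME_WAIT    = 6,
	MOCK_TCP_LISTEN       = 10,
	MOCK_TCP_NEW_SYN_RECV = 12,
};

/* Hypothetical stand-in for struct sock; not the real layout. */
struct mock_sock {
	int  state;      /* models skc_state, valid for every socket kind */
	bool reuseport;  /* models skc_reuseport, also in sock_common */
};

/* Same idea as sk_fullsock(): anything that is neither a timewait
 * socket nor a request socket counts as a full socket. */
static bool mock_sk_fullsock(const struct mock_sock *sk)
{
	int not_full = (1 << MOCK_TCP_TIME_WAIT) | (1 << MOCK_TCP_NEW_SYN_RECV);

	return ((1 << sk->state) & ~not_full) != 0;
}

/* Simplified model of the proposed early exit. Returns true only when
 * a reuseport lookup would still be considered for this socket. */
static bool mock_may_lookup_reuseport(const struct mock_sock *sk,
				      bool prefetched)
{
	assert(sk != NULL);

	/* The real helper would "return sk" unchanged at this point. */
	if (!prefetched || !mock_sk_fullsock(sk))
		return false;

	/* Only reached for full sockets, so full-socket-only fields
	 * such as sk_protocol would be safe to read from here on. */
	return sk->reuseport;
}
```

With this ordering a reqsk created for a SYN never reaches the reuseport
path, while a prefetched listener with the flag set still does.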

Friendly ping. Thanks.