Re: [syzbot] [net?] [nfc?] KASAN: slab-use-after-free Read in nfc_alloc_send_skb

From: Siddh Raman Pant
Date: Thu Nov 16 2023 - 11:56:37 EST


TLDR: Different stages of 1 and 2 can race with each other causing UAF.

1. llcp_sock_sendmsg -> nfc_llcp_send_ui_frame -> loop call (nfc_alloc_send_skb(nfc_dev))

2. virtual_ncidev_close -> [... -> nfc_llcp_socket_release -> ...] -> [... -> nfc_free_device]

---

Hi,

I've been trying to fix this bug for some time but keep getting
stuck every now and then. If someone could give more input or fix it,
it would be really helpful.

This bug is due to a race between sendmsg and the freeing of nfc_dev.

For connectionless transmission, the llcp_sock_sendmsg() codepath
eventually calls nfc_alloc_send_skb(), which takes an nfc_dev as
an argument to calculate the total size for the skb allocation.

The virtual_ncidev_close() codepath eventually releases the socket by
calling nfc_llcp_socket_release() (which sets sk->sk_state to
LLCP_CLOSED), and afterwards the nfc_dev is eventually freed.

When the nfc_dev gets freed, llcp_sock_sendmsg() results in a
use-after-free because it

(1) doesn't have any checks in place to avoid sending the datagram.
    (1.1) Checking for LLCP_CLOSED in llcp_sock_sendmsg() does make
          the race less likely. With -smp 6 it did not trigger on
          my PC, leading me to naively think that was the solution
          until syzbot told me quite some time later that it isn't.

(2) calls nfc_llcp_send_ui_frame(), which has a do-while loop that
    can race with the freeing (a msg of size 4096 is sent in chunks of
    128 in this repro).
    (2.1) By this I mean that just moving the nfc_dev access from
          nfc_alloc_send_skb() into this function, be it
          inside or outside the loop, naturally doesn't work.
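
To make (2) concrete, here is a minimal userspace model of the chunked
send loop. It is only a sketch, not the kernel code: struct chunk_dev,
chunk_send_model() and the free_after parameter are hypothetical names
made up for illustration, and the "free" is simulated with a flag so the
program stays well defined. It counts how many loop iterations would
touch the device after the close path has run.

```c
#include <stddef.h>

/* Hypothetical stand-in for the device; only the fields that the
 * size calculation would read are modelled. */
struct chunk_dev {
    int freed;          /* set when the modelled close path has run */
    size_t headroom;
    size_t tailroom;
};

/*
 * Model of a do-while loop that sends msg_len bytes in fragments of at
 * most frag_len bytes. The modelled close path runs just before fragment
 * number free_after (0-based) is built. Returns the number of fragments;
 * *stale_reads is the number of iterations that read the device after
 * it was marked freed (each would be a use-after-free in real code).
 */
size_t chunk_send_model(struct chunk_dev *dev, size_t msg_len,
                        size_t frag_len, size_t free_after,
                        size_t *stale_reads)
{
    size_t remaining = msg_len;
    size_t frags = 0;

    *stale_reads = 0;
    if (msg_len == 0 || frag_len == 0)
        return 0;

    do {
        size_t len = remaining < frag_len ? remaining : frag_len;

        if (frags == free_after)
            dev->freed = 1;     /* simulated concurrent close */

        if (dev->freed)
            (*stale_reads)++;   /* the device read below is now stale */

        /* The per-fragment allocation size depends on the device. */
        (void)(dev->headroom + len + dev->tailroom);

        remaining -= len;
        frags++;
    } while (remaining > 0);

    return frags;
}
```

With msg_len = 4096 and frag_len = 128 the loop runs 32 times, so a close
that lands anywhere during the send leaves every remaining iteration
reading a stale device, no matter where inside the function the access
is placed.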

When the nfc_dev has been freed and we still happen to get the headroom
and tailroom, the PDU skb seems not to be allocated and ENXIO is returned.

I tried looking at other code in the net subsystem to get an idea of how
other places handle this, but accessing the device later in the codepath
does not seem to be the norm. So I am starting to think some refactoring of
the locking logic may be needed (or maybe RCU-protect headroom and tailroom?).
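
One generic pattern for this kind of lifetime problem is to have the
sender hold its own reference on the device for the whole send, so that
the close path only drops its reference and the actual release happens
when the last holder is done. The following is a userspace sketch of that
idea under stated assumptions, not a proposed patch: struct ref_dev,
ref_dev_get(), ref_dev_put() and ref_send_model() are hypothetical names,
the counter is a plain int (real code would need an atomic refcount, and
the "is it still alive" check plus the get would have to be done
atomically with respect to the release), and "release" just sets a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted device model. */
struct ref_dev {
    int refcount;       /* plain int here; must be atomic in real code */
    int released;       /* set when the last reference is dropped */
    size_t headroom;
    size_t tailroom;
};

void ref_dev_get(struct ref_dev *dev)
{
    dev->refcount++;
}

void ref_dev_put(struct ref_dev *dev)
{
    assert(dev->refcount > 0);
    if (--dev->refcount == 0)
        dev->released = 1;      /* real code would free the device here */
}

/*
 * Sends msg_len bytes in fragments of at most frag_len bytes while
 * holding a reference. The modelled close path drops the owner's
 * reference just before fragment number close_after (0-based).
 * Returns the number of fragments sent, or 0 if the device was
 * already released when the send started.
 */
size_t ref_send_model(struct ref_dev *dev, size_t msg_len,
                      size_t frag_len, size_t close_after)
{
    size_t remaining = msg_len;
    size_t sent = 0;

    if (dev->released || frag_len == 0)
        return 0;

    ref_dev_get(dev);           /* sender's own reference */

    while (remaining > 0) {
        size_t len = remaining < frag_len ? remaining : frag_len;

        if (sent == close_after)
            ref_dev_put(dev);   /* close path drops the owner's ref */

        /* Still valid: the sender's reference keeps the device alive. */
        assert(!dev->released);
        (void)(dev->headroom + len + dev->tailroom);

        remaining -= len;
        sent++;
    }

    ref_dev_put(dev);           /* last reference may go away here */
    return sent;
}
```

In this model a close that lands in the middle of a 4096-byte send no
longer invalidates the device under the loop; the release is simply
deferred until the sender drops its reference. Whether something like
this fits the existing locking here is exactly the open question.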

I don't know if I'm correct, but anyway, where does one start?

Thanks,
Siddh