Re: [PATCH v3] drivers: net: prevent tun_get_user() to exceed xdp size limits

From: Jesper Dangaard Brouer
Date: Thu Jul 27 2023 - 07:14:07 EST




On 27/07/2023 11.30, Paolo Abeni wrote:
On Thu, 2023-07-27 at 14:07 +0800, Jason Wang wrote:
On Thu, Jul 27, 2023 at 8:27 AM David Ahern <dsahern@xxxxxxxxx> wrote:

On 7/26/23 1:37 PM, David Ahern wrote:
On 7/26/23 3:02 AM, Jesper Dangaard Brouer wrote:
Cc. John and Ahern

On 26/07/2023 04.09, Jason Wang wrote:
On Tue, Jul 25, 2023 at 11:54 PM Andrew Kanner
<andrew.kanner@xxxxxxxxx> wrote:

Syzkaller reported the following issue:
=======================================
Too BIG xdp->frame_sz = 131072

Is this a contiguous physical memory allocation?

131072 bytes equal order 5 page.

Looking at tun.c code I cannot find a code path that could create
order-5 skb->data, but only SKB with order-0 fragments. But I guess it
is the netif_receive_generic_xdp() what will realloc to make this linear
(via skb_linearize())


get_tun_user is passed an iov_iter with a single segment of 65007
total_len. The alloc_skb path is hit with an align size of only 64. That
is insufficient for XDP so the netif_receive_generic_xdp hits the
pskb_expand_head path. Something is off in the math in
netif_receive_generic_xdp resulting in the skb markers being off. That
causes bpf_prog_run_generic_xdp to compute the wrong frame_sz.


BTW, it is pskb_expand_head that turns it from a 64kB to a 128 kB
allocation. But the 128kB part is not relevant to the "bug" here really.


True, it is another "bug"/unexpected-behavior that SKB gets reallocated
to be 128KiB. We should likely solve this in another patch.

The warn on getting tripped in bpf_xdp_adjust_tail is because xdp
generic path is skb based and can have a frame_sz > 4kB. That's what the
splat is about.

Agree, that the warn condition should be changed, even removed.
It is interesting that this warn caught this unexpected-behavior of
expanding to 128KiB.


Other possibility:

tun_can_build_skb() doesn't count XDP_PACKET_HEADROOM this may end up
with producing a frame_sz which is greater than PAGE_SIZE as well in
tun_build_skb().

True, and the way I read the tun_build_skb() code, via skb_page_frag_refill(),
it can produce an SKB with data size (buflen) upto order-3 = 32KiB (SKB_FRAG_PAGE_ORDER).

Thus, the existing check in tun_can_build_skb() for PAGE_SIZE can/should be relaxed?
(Please correct me as I don't fully understand tun_get_user() code)


And rethink this patch, it looks wrong since it basically drops all
packets whose buflen is greater than PAGE_SIZE since it can't fall
back to tun_alloc_skb().


I agree, this is why I reacted, as this version of the patch could
potentially cause issues and packet drops.


Perhaps the solution is to remove the WARN_ON.

Yes, that is what I'm asking if this warning still makes sense in V1.

I understand the consensus is solving the issue by changing/removing
the WARN_ON() in XDP. I think it makes sense, I guess the same warn can
be reached via packet socket xmit on veth or similar topology.


Yes, we can completely remove this check. The original intend was to
catch cases where XDP drivers have not been updated to use xdp.frame_sz,
but that is not longer a concern (since xdp_init_buff).

It was added (by me) in commit:
- c8741e2bfe87 ("xdp: Allow bpf_xdp_adjust_tail() to grow packet size")
- v5.8-rc1
- as part of merge 5cc5924d8315

I'm sure it is safe to remove since commit:
- 43b5169d8355 ("net, xdp: Introduce xdp_init_buff utility routine")
- v5.12-rc1

where we introduced xdp_init_buff() helper, which all XDP driver use today.
Question is what "Fixes:" tag should the patch have?

To Andrew, will you
(1) send a new patch that removes this check instead?
(2) have cycles to investigate why the unexpected-behavior of
expanding to 128KiB happens?

--Jesper