Re: [PATCH net] udp: fix segmentation crash for untrusted source packet

From: Lena Wang (王娜)
Date: Fri Mar 15 2024 - 04:13:15 EST


On Wed, 2024-03-13 at 16:41 +0100, Paolo Abeni wrote:
>
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> On Wed, 2024-03-13 at 21:34 +0800, Shiming Cheng wrote:
> > Kernel exception is reported when making udp frag list
> segmentation.
> > Backtrace is as below:
> > at out/android15-6.6/kernel-6.6/kernel-
> 6.6/net/ipv4/udp_offload.c:229
> > at out/android15-6.6/kernel-6.6/kernel-
> 6.6/net/ipv4/udp_offload.c:262
> > features=features@entry=19, is_ipv6=false)
> > at out/android15-6.6/kernel-6.6/kernel-
> 6.6/net/ipv4/udp_offload.c:289
> > features=19)
> > at out/android15-6.6/kernel-6.6/kernel-
> 6.6/net/ipv4/udp_offload.c:399
> > features=19)
> > at out/android15-6.6/kernel-6.6/kernel-
> 6.6/net/ipv4/af_inet.c:1418
> > skb@entry=0x0, features=19, features@entry=0)
> > at out/android15-6.6/kernel-6.6/kernel-6.6/net/core/gso.c:53
> > tx_path=<optimized out>)
> > at out/android15-6.6/kernel-6.6/kernel-6.6/net/core/gso.c:124
>
> A full backtrace would help better understanding the issue.

Below is full backtrace:
[ 1100.812205][ C3] CPU: 3 PID: 0 Comm: swapper/3 Tainted:
G W OE 6.6.17-android15-0-g380371ea9bf1 #1
[ 1100.812211][ C3] Hardware name: MT6991(ENG) (DT)
[ 1100.812215][ C3] Call trace:
[ 1100.812218][ C3] dump_backtrace+0xec/0x138
[ 1100.812222][ C3] show_stack+0x18/0x24
[ 1100.812226][ C3] dump_stack_lvl+0x50/0x6c
[ 1100.812232][ C3] dump_stack+0x18/0x24
[ 1100.812237][ C3] mrdump_common_die+0x24c/0x388 [mrdump]
[ 1100.812259][ C3] ipanic_die+0x20/0x34 [mrdump]
[ 1100.812269][ C3] notifier_call_chain+0x90/0x174
[ 1100.812275][ C3] notify_die+0x50/0x8c
[ 1100.812279][ C3] die+0x94/0x308
[ 1100.812283][ C3] __do_kernel_fault+0x240/0x26c
[ 1100.812288][ C3] do_page_fault+0xa0/0x48c
[ 1100.812293][ C3] do_translation_fault+0x38/0x54
[ 1100.812297][ C3] do_mem_abort+0x58/0x104
[ 1100.812302][ C3] el1_abort+0x3c/0x5c
[ 1100.812307][ C3] el1h_64_sync_handler+0x54/0x90
[ 1100.812313][ C3] el1h_64_sync+0x68/0x6c
[ 1100.812317][ C3] __udp_gso_segment+0x298/0x4d4
[ 1100.812322][ C3] udp4_ufo_fragment+0x130/0x174
[ 1100.812326][ C3] inet_gso_segment+0x164/0x330
[ 1100.812330][ C3] skb_mac_gso_segment+0xc4/0x13c
[ 1100.812335][ C3] __skb_gso_segment+0xc4/0x120
[ 1100.812339][ C3] udp_rcv_segment+0x50/0x134
[ 1100.812344][ C3] udp_queue_rcv_skb+0x74/0x114
[ 1100.812348][ C3] udp_unicast_rcv_skb+0x94/0xac
[ 1100.812353][ C3] __udp4_lib_rcv+0x3e0/0x818
[ 1100.812358][ C3] udp_rcv+0x20/0x30
[ 1100.812362][ C3] ip_protocol_deliver_rcu+0x194/0x368
[ 1100.812368][ C3] ip_local_deliver+0xe4/0x184
[ 1100.812373][ C3] ip_rcv+0x90/0x118
[ 1100.812378][ C3] __netif_receive_skb+0x74/0x124
[ 1100.812383][ C3] process_backlog+0xd8/0x18c
[ 1100.812388][ C3] __napi_poll+0x5c/0x1fc
[ 1100.812392][ C3] net_rx_action+0x150/0x334
[ 1100.812397][ C3] __do_softirq+0x120/0x3f4
[ 1100.812401][ C3] ____do_softirq+0x10/0x20
[ 1100.812405][ C3] call_on_irq_stack+0x3c/0x74
[ 1100.812410][ C3] do_softirq_own_stack+0x1c/0x2c
[ 1100.812414][ C3] __irq_exit_rcu+0x5c/0xd4
[ 1100.812418][ C3] irq_exit_rcu+0x10/0x1c
[ 1100.812422][ C3] el1_interrupt+0x38/0x58
[ 1100.812428][ C3] el1h_64_irq_handler+0x18/0x24
[ 1100.812434][ C3] el1h_64_irq+0x68/0x6c
[ 1100.812437][ C3] arch_local_irq_enable+0x4/0x8
[ 1100.812443][ C3] cpuidle_enter+0x38/0x54
[ 1100.812449][ C3] do_idle+0x198/0x294
[ 1100.812454][ C3] cpu_startup_entry+0x34/0x3c
[ 1100.812459][ C3] secondary_start_kernel+0x138/0x158
[ 1100.812465][ C3] __secondary_switched+0xc0/0xc4

> > This packet's frag list is null while gso_type is not 0. Then it is
> treated
> > as a GRO-ed packet and sent to segment frag list. Function call
> path is
> > udp_rcv_segment => config features value
> > __udpv4_gso_segment => skb_gso_ok returns false. Here it
> should be
> > true.
>
> Why? If I read correctly the above, this is GSO packet landing in an
> UDP socket with no UDP_GRO sockopt. The packet is expected to be
> segmented again.
>
Yes, it is GSO packet, however the fragment list of this GSO packet
becomes NULL. As the occurrence rate is very low, we really don’t know
why and when it becomes to be NULL. It happens both in cellular and
wlan network and seems an unknown kernel issue.

To avoid crash the packet should skip to be segmented when fraglist is
null.

> >Failed reason is features doesn't
> match
> > gso_type.
> > __udp_gso_segment_list
> > skb_segment_list => packet is linear with skb->next =
> NULL
> > __udpv4_gso_segment_list_csum => use skb->next directly
> and
> > crash happens
> >
> > In rx-gro-list GRO-ed packet is set gso type as
> > NETIF_F_GSO_UDP_L4 | NETIF_F_GSO_FRAGLIST in napi_gro_complete. In
> gso
> > flow the features should also set them to match with gso_type. Or
> else it
> > will always return false in skb_gso_ok. Then it can't discover the
> > untrusted source packet and result crash in following function.
>
> What is the 'untrusted source' here? I read the above as the packet
> aggregation happened in the GRO engine???
>
> Could you please give a complete description of the relevant
> scenario?
>

According to the backtrace info, we infer it is a rx-frag_list GRO
packet. Before sending into the UDP socket with no UDP_GRO sockopt, it
seems enter "skb_condense" to trim it and loose his frag list. However
it still keeps gso_type and gso_size. Then it continues to do
skb_segment_list.

First crash happens in skb_segment_list.
This patch resolves the crash and lets the packet becomes a skb without
skb->next:
https://lore.kernel.org/all/Y9gt5EUizK1UImEP@debian/
Then crash moves to __udp_gso_sement_list -> skb_segment_list(finish)
-> __udpv4_gso_segment_list_csum, it uses skb->next without check then
crash.


What we want to do is to drop this abnormal packet. So we set features
NETIF_F_GSO_UDP_L4 |NETIF_F_GSO_FRAGLIST to match fixes: f2696099c6c6
condation then drop it.

Another solution is we can check the frag_list directly in
skb_segment_lsit:
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 011d690..5b054f0 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4336,6 +4336,8 @@
int len_diff, err;

skb_push(skb, -skb_network_offset(skb) + offset);
+ if (!list_skb)
+ goto err_linearize;

/* Ensure the head is writeable before touching the shared info */
err = skb_unclone(skb, GFP_ATOMIC);

> > Fixes: f2696099c6c6 ("udp: Avoid post-GRO UDP checksum
> recalculation")
> > Signed-off-by: Shiming Cheng <shiming.cheng@xxxxxxxxxxxx>
> > Signed-off-by: Lena Wang <lena.wang@xxxxxxxxxxxx>
> > ---
> > include/net/udp.h | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/include/net/udp.h b/include/net/udp.h
> > index 488a6d2babcc..c87baa23b9da 100644
> > --- a/include/net/udp.h
> > +++ b/include/net/udp.h
> > @@ -464,7 +464,7 @@ void udpv6_encap_enable(void);
> > static inline struct sk_buff *udp_rcv_segment(struct sock *sk,
> > struct sk_buff *skb, bool ipv4)
> > {
> > -netdev_features_t features = NETIF_F_SG;
> > +netdev_features_t features = NETIF_F_SG | NETIF_F_GSO_UDP_L4 |
> NETIF_F_GSO_FRAGLIST;
>
> This looks wrong: real UDP_L4 GSO packets will not segmented anymore
> and should be dropped (?!?)
>
>