Re: [syzbot] [net?] WARNING in mpls_gso_segment

From: Florian Westphal
Date: Wed Feb 21 2024 - 08:16:35 EST


syzbot <syzbot+99d15fcdb0132a1e1a82@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1536462c180000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/adbf5d8e38d7/disk-49344462.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/0f8e3fb78410/vmlinux-49344462.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/682f4814bf23/bzImage-49344462.xz
>
> The issue was bisected to:
>
> commit 219eee9c0d16f1b754a8b85275854ab17df0850a
> Author: Florian Westphal <fw@xxxxxxxxx>
> Date: Fri Feb 16 11:36:57 2024 +0000
>
> net: skbuff: add overflow debug check to pull/push helpers
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=13262752180000
> final oops: https://syzkaller.appspot.com/x/report.txt?x=10a62752180000
> console output: https://syzkaller.appspot.com/x/log.txt?x=17262752180000
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+99d15fcdb0132a1e1a82@xxxxxxxxxxxxxxxxxxxxxxxxx
> Fixes: 219eee9c0d16 ("net: skbuff: add overflow debug check to pull/push helpers")
>
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 5068 at include/linux/skbuff.h:2723 pskb_may_pull_reason include/linux/skbuff.h:2723 [inline]
> WARNING: CPU: 0 PID: 5068 at include/linux/skbuff.h:2723 pskb_may_pull include/linux/skbuff.h:2739 [inline]
> WARNING: CPU: 0 PID: 5068 at include/linux/skbuff.h:2723 mpls_gso_segment+0x773/0xaa0 net/mpls/mpls_gso.c:34

Two possible solutions:

1)

diff --git a/net/mpls/mpls_gso.c b/net/mpls/mpls_gso.c
index 533d082f0701..43801b78dd64 100644
--- a/net/mpls/mpls_gso.c
+++ b/net/mpls/mpls_gso.c
@@ -25,12 +25,13 @@ static struct sk_buff *mpls_gso_segment(struct sk_buff *skb,
 	netdev_features_t mpls_features;
 	u16 mac_len = skb->mac_len;
 	__be16 mpls_protocol;
-	unsigned int mpls_hlen;
+	int mpls_hlen;

 	skb_reset_network_header(skb);
 	mpls_hlen = skb_inner_network_header(skb) - skb_network_header(skb);
-	if (unlikely(!mpls_hlen || mpls_hlen % MPLS_HLEN))
+	if (unlikely(mpls_hlen <= 0 || mpls_hlen % MPLS_HLEN))
 		goto out;
+
 	if (unlikely(!pskb_may_pull(skb, mpls_hlen)))
 		goto out;

(or a variation thereof).
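
To illustrate what the signed type buys us in 1), here is a minimal userspace sketch of the check. It is not kernel code: mpls_hlen_valid() is a made-up helper that takes plain header offsets, and MPLS_HLEN is assumed to be 4.

```c
#include <stdbool.h>

#define MPLS_HLEN 4	/* assumed: size of one MPLS label stack entry */

/*
 * Hypothetical helper mirroring the patched check: with a signed
 * length, a negative header delta is rejected before any pull.
 */
static bool mpls_hlen_valid(int inner_net_off, int net_off)
{
	int mpls_hlen = inner_net_off - net_off;

	return mpls_hlen > 0 && mpls_hlen % MPLS_HLEN == 0;
}
```

With the offsets seen in the repro, mpls_hlen_valid(0, 108) is false, so the function would bail out before pskb_may_pull() is reached.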

2) Revert the pskb_may_pull_reason change added in 219eee9c0d16f1b754a8, so
that it tolerates "negative" (huge) may-pull requests again.

With the above repro, skb_inner_network_header() yields 0 and
skb_network_header() returns 108, so we end up calling
pskb_may_pull(skb, -108). Since 'len' is unsigned, this is a huge value,
which now triggers the DEBUG_NET_WARN_ON_ONCE() check.
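
For reference, the wrap-around can be reproduced in userspace. This is only a sketch: hlen_unsigned() and old_check_passes() are made-up helpers, MPLS_HLEN is assumed to be 4 and unsigned int is assumed to be 32 bits wide. It also shows why the existing "!mpls_hlen || mpls_hlen % MPLS_HLEN" test does not catch the value: the wrapped result is nonzero and still a multiple of 4.

```c
#include <limits.h>
#include <stdbool.h>

#define MPLS_HLEN 4	/* assumed: size of one MPLS label stack entry */

/* Header delta computed in unsigned arithmetic, so a negative result wraps. */
static unsigned int hlen_unsigned(unsigned int inner_net_off,
				  unsigned int net_off)
{
	return inner_net_off - net_off;
}

/* True if the unpatched "!mpls_hlen || mpls_hlen % MPLS_HLEN" test lets it through. */
static bool old_check_passes(unsigned int mpls_hlen)
{
	return mpls_hlen != 0 && mpls_hlen % MPLS_HLEN == 0;
}
```

With the repro's offsets, hlen_unsigned(0, 108) gives 4294967188, which is above INT_MAX and still passes the old check.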

Before the blamed commit, this would make pskb_may_pull hit:

	if (unlikely(len > skb->len))
		return SKB_DROP_REASON_PKT_TOO_SMALL;

and mpls_gso_segment would take the 'goto out' path.

So the question is really whether we should fix this in mpls_gso (and possibly
others that try to pull negative numbers...) or legalize it, either by adding
an explicit if (unlikely(len > INT_MAX)) test to pskb_may_pull_reason or by
adding a comment that negative 'len' values are expected to be caught by the
check against skb->len.
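
To make the two "legalize" variants concrete, here is a simplified userspace model. Everything in it is a stand-in (struct fake_skb, enum pull_result and both function names are made up for this mail), and it assumes skb->len itself never exceeds INT_MAX.

```c
#include <limits.h>

/* Simplified stand-ins, not the kernel's types. */
enum pull_result { PULL_OK, PULL_TOO_SMALL };

struct fake_skb {
	unsigned int len;	/* total packet length */
};

/* Variant A: rely on the existing length check (plus a comment). */
static enum pull_result may_pull_implicit(const struct fake_skb *skb,
					  unsigned int len)
{
	/* A "negative" len is a huge unsigned value and exceeds skb->len. */
	if (len > skb->len)
		return PULL_TOO_SMALL;
	return PULL_OK;
}

/* Variant B: reject "negative" len with an explicit test first. */
static enum pull_result may_pull_explicit(const struct fake_skb *skb,
					  unsigned int len)
{
	if (len > (unsigned int)INT_MAX)
		return PULL_TOO_SMALL;
	if (len > skb->len)
		return PULL_TOO_SMALL;
	return PULL_OK;
}
```

Both variants reject a request of (unsigned int)-108 in this model; the difference is only whether the intent is spelled out in code or left to a comment.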

Opinions?