Re: [syzbot] [batman?] BUG: soft lockup in sys_sendmsg

From: syzbot
Date: Mon Feb 12 2024 - 08:38:40 EST


> On Monday, 12 February 2024 11:26:24 CET syzbot wrote:
>> syzbot found the following issue on:
>>
>> HEAD commit: 41bccc98fb79 Linux 6.8-rc2
>> git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
>> console output: https://syzkaller.appspot.com/x/log.txt?x=14200118180000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=451a1e62b11ea4a6
>> dashboard link: https://syzkaller.appspot.com/bug?extid=a6a4b5bb3da165594cff
>> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
>> userspace arch: arm64
>>
>> Unfortunately, I don't have any reproducer for this issue yet.
>>
>> Downloadable assets:
>> disk image: https://storage.googleapis.com/syzbot-assets/0772069e29cf/disk-41bccc98.raw.xz
>> vmlinux: https://storage.googleapis.com/syzbot-assets/659d3f0755b7/vmlinux-41bccc98.xz
>> kernel image: https://storage.googleapis.com/syzbot-assets/7780a45c3e51/Image-41bccc98.gz.xz
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+a6a4b5bb3da165594cff@xxxxxxxxxxxxxxxxxxxxxxxxx
>>
>
> #syz test

This crash does not have a reproducer. I cannot test it.

>
> From 5984ace8f8df7cf8d6f98ded0eebe7d962028992 Mon Sep 17 00:00:00 2001
> From: Sven Eckelmann <sven@xxxxxxxxxxxxx>
> Date: Mon, 12 Feb 2024 13:10:33 +0100
> Subject: [PATCH] batman-adv: Avoid infinite loop trying to resize local TT
>
> If the MTU of one of the attached interfaces becomes too small to transmit
> the local translation table, then the table must be resized to fit inside
> all fragments (when enabled) or a single packet.
>
> But if the MTU becomes too low to transmit even the header + the VLAN
> specific part then the resizing of the local TT will never succeed. This
> can for example happen when the usable space is 110 bytes and 11 VLANs are
> on top of batman-adv. In this case, at least 116 bytes would be needed.
> There will just be an endless spam of
>
> batman_adv: batadv0: Forced to purge local tt entries to fit new maximum fragment MTU (110)
>
> in the log but the function will never finish. The problem here is that
> the timeout is halved in each step until it stagnates at 0, after which
> purging can no longer reduce the table any further.
>
> There are other scenarios possible with a similar result. The number of
> BATADV_TT_CLIENT_NOPURGE entries in the local TT can for example be too
> high to fit inside a packet. Such a scenario can therefore also happen with
> only a single VLAN + 7 non-purgeable addresses, requiring at least 120
> bytes.
>
> While this should be handled proactively when:
>
> * interface with too low MTU is added
> * VLAN is added
> * non-purgeable local MAC is added
> * MTU of an attached interface is reduced
> * fragmentation setting gets disabled (which most likely requires dropping
>   attached interfaces)
>
> not all of these scenarios can be prevented because batman-adv is only
> consuming events without the possibility to prevent these actions
> (non-purgeable MAC address added, MTU of an attached interface is reduced).
> It is therefore necessary to make sure that the code is also able to handle
> situations in which incompatible system configurations are already
> present.
>
> Cc: stable@xxxxxxxxxxxxxxx
> Fixes: a19d3d85e1b8 ("batman-adv: limit local translation table max size")
> Reported-by: syzbot+a6a4b5bb3da165594cff@xxxxxxxxxxxxxxxxxxxxxxxxx
> Signed-off-by: Sven Eckelmann <sven@xxxxxxxxxxxxx>
> ---
> net/batman-adv/translation-table.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/batman-adv/translation-table.c b/net/batman-adv/translation-table.c
> index b95c36765d04..2243cec18ecc 100644
> --- a/net/batman-adv/translation-table.c
> +++ b/net/batman-adv/translation-table.c
> @@ -3948,7 +3948,7 @@ void batadv_tt_local_resize_to_mtu(struct net_device *soft_iface)
>
>  	spin_lock_bh(&bat_priv->tt.commit_lock);
>
> -	while (true) {
> +	while (timeout) {
>  		table_size = batadv_tt_local_table_transmit_size(bat_priv);
>  		if (packet_size_max >= table_size)
>  			break;
> --
> 2.39.2
>
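
The size arithmetic and the loop behaviour described in the quoted commit message can be sketched in a small userspace C model. This is not the kernel code. The three size constants (28, 8 and 12 bytes) are assumptions, chosen only because they reproduce the 116 and 120 byte figures quoted above, and min_table_size() and resize_iterations() are hypothetical helpers. The model takes the worst case in which purging frees nothing, and uses an iteration cap as a stand-in for a loop that never returns.

```c
#include <stdbool.h>

/* Assumed sizes in bytes (not taken from the patch); they are chosen so
 * that the totals match the 116 and 120 byte figures in the commit message.
 */
enum {
	HDR_SIZE = 28,   /* fixed header part */
	VLAN_SIZE = 8,   /* VLAN specific part, per VLAN */
	ENTRY_SIZE = 12, /* per table entry */
};

/* Smallest packet size that can carry the table (hypothetical helper). */
int min_table_size(int num_vlans, int num_entries)
{
	return HDR_SIZE + num_vlans * VLAN_SIZE + num_entries * ENTRY_SIZE;
}

/* Model of the resize loop for the case where purging cannot shrink the
 * table, so table_size stays constant.
 *
 * check_timeout == false models "while (true)",
 * check_timeout == true  models "while (timeout)".
 *
 * Returns the number of purge attempts, or -1 when iter_cap is reached,
 * which stands in for "the function never finishes".
 */
int resize_iterations(int size_max, int table_size, int timeout,
		      bool check_timeout, int iter_cap)
{
	int iterations = 0;

	while (!check_timeout || timeout) {
		if (size_max >= table_size)
			break;

		if (iterations == iter_cap)
			return -1;

		/* a purge attempt would happen here; it frees nothing */
		timeout /= 2; /* integer halving stagnates at 0 */
		iterations++;
	}

	return iterations;
}
```

With a 110 byte limit and 11 VLANs, resize_iterations(110, min_table_size(11, 0), 300000, false, 1000) runs into the cap and returns -1, while the same call with check_timeout set to true returns as soon as the timeout has been halved down to 0. The starting timeout of 300000 is likewise only an assumed value for the illustration.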