Re: KASAN slab-use-after-free in __lock_acquire

From: Peter Zijlstra
Date: Tue Nov 07 2023 - 11:58:52 EST


On Wed, Nov 01, 2023 at 07:22:31PM +0000, Bai, Shuangpeng wrote:
> Dear Kernel Maintainers,
>
> We found a new kernel bug. Please see the details below.
>
> A slab-use-after-free bug can be triggered when the kernel dereferences a task pointer that was previously freed in delayed_put_task_struct().
> I'm sorry for bothering you if this bug is not related to you. I would appreciate it if you could help me find the person responsible. Thank you!
>
> Kernel commit: 8bc9e6515183935fa0cccaf67455c439afe4982b (recent upstream)
> Kernel config: attachment
> C/Syz reproducer: attachment
>
> [ 314.465397][ C1] ==================================================================
> [ 314.467080][ C1] BUG: KASAN: slab-use-after-free in __lock_acquire (kernel/locking/lockdep.c:5004)
> [ 314.469666][ C1] Read of size 8 at addr ffff88801bd9ad08 by task systemd-udevd/8228
> [ 314.471271][ C1]
> [ 314.471719][ C1] CPU: 1 PID: 8228 Comm: systemd-udevd Not tainted 6.6.0-06824-g8bc9e6515183 #4
> [ 314.473512][ C1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> [ 314.475321][ C1] Call Trace:
> [ 314.475991][ C1] <IRQ>
> [ 314.476576][ C1] dump_stack_lvl (lib/dump_stack.c:107)
> [ 314.478518][ C1] print_report (mm/kasan/report.c:365 mm/kasan/report.c:475)
> [ 314.479423][ C1] ? __virt_addr_valid (arch/x86/mm/physaddr.c:66)
> [ 314.480440][ C1] ? __phys_addr (arch/x86/mm/physaddr.c:32 (discriminator 4))
> [ 314.481348][ C1] ? __lock_acquire (kernel/locking/lockdep.c:5004)
> [ 314.482328][ C1] kasan_report (mm/kasan/report.c:590)
> [ 314.483164][ C1] ? __lock_acquire (kernel/locking/lockdep.c:5004)
> [ 314.484185][ C1] __lock_acquire (kernel/locking/lockdep.c:5004)
> [ 314.485199][ C1] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4992)
> [ 314.486418][ C1] ? lockdep_unlock (kernel/locking/lockdep.c:157)
> [ 314.487390][ C1] ? __lock_acquire (kernel/locking/lockdep.c:186 kernel/locking/lockdep.c:3872 kernel/locking/lockdep.c:5136)
> [ 314.488411][ C1] lock_acquire (kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5755 kernel/locking/lockdep.c:5718)
> [ 314.489329][ C1] ? try_to_wake_up (kernel/sched/core.c:4049 kernel/sched/core.c:4228)
> [ 314.490296][ C1] ? lock_sync (kernel/locking/lockdep.c:5721)
> [ 314.491182][ C1] ? __lock_acquire (./arch/x86/include/asm/bitops.h:228 ./arch/x86/include/asm/bitops.h:240 ./include/asm-generic/bitops/instrumented-non-atomic.h:142 kernel/locking/lockdep.c:228 kernel/locking/lockdep.c:3780 kernel/locking/lockdep.c:3836 kernel/locking/lockdep.c:5136)
> [ 314.492196][ C1] ? _raw_spin_lock_irqsave (./include/linux/spinlock_api_smp.h:108 kernel/locking/spinlock.c:162)
> [ 314.493263][ C1] ? nilfs_segctor_zeropad_segsum (fs/nilfs2/segment.c:2441)
> [ 314.494511][ C1] _raw_spin_lock_irqsave (./include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162)
> [ 314.495579][ C1] ? try_to_wake_up (kernel/sched/core.c:4049 kernel/sched/core.c:4228)
> [ 314.496572][ C1] try_to_wake_up (kernel/sched/core.c:4049 kernel/sched/core.c:4228)
> [ 314.497480][ C1] ? sched_ttwu_pending (kernel/sched/core.c:4196)
> [ 314.498493][ C1] ? do_raw_spin_unlock (./arch/x86/include/asm/atomic.h:23 ./include/linux/atomic/atomic-arch-fallback.h:457 ./include/linux/atomic/atomic-instrumented.h:33 ./include/asm-generic/qspinlock.h:57 kernel/locking/spinlock_debug.c:100 kernel/locking/spinlock_debug.c:140)
> [ 314.499518][ C1] ? nilfs_segctor_zeropad_segsum (fs/nilfs2/segment.c:2441)
> [ 314.500737][ C1] call_timer_fn (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/timer.h:127 kernel/time/timer.c:1701)

For some reason the backtrace fails to mention which timer function is
called. Having that would be helpful.
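
Until the timer function is known, the only hints in the quoted trace are
the unreliable frames, the ones prefixed with "?". Those are addresses found
on the stack that the unwinder could not verify, so they suggest where to
look but prove nothing. A minimal sketch for pulling them out of a decoded
trace follows. It assumes the line format shown above; the helper name
candidate_frames and the CORE_PREFIXES list are made up for illustration and
are not part of any kernel tooling.

```python
import re

# Matches one decoded frame, for example:
#   "[ 314.493263][ C1] ? nilfs_segctor_zeropad_segsum (fs/nilfs2/segment.c:2441)"
# "guess" is set only for unreliable frames, which are prefixed with "?".
FRAME_RE = re.compile(
    r"\]\s+(?P<guess>\?\s+)?(?P<func>[A-Za-z_][\w.]*)\s+\((?P<loc>[^)]*)\)"
)

# Assumption: frames under these paths are generic infrastructure and are
# unlikely to name the subsystem that armed the timer.
CORE_PREFIXES = ("kernel/", "lib/", "mm/", "arch/", "include/", "./")


def candidate_frames(trace_text):
    """Return unique (function, file:line) pairs for "?" frames that lie
    outside the core directories listed in CORE_PREFIXES."""
    seen = []
    for line in trace_text.splitlines():
        m = FRAME_RE.search(line)
        if not m or not m.group("guess"):
            continue  # not a frame, or a frame the unwinder trusts
        loc = m.group("loc").split()[0]  # first file:line of an inline chain
        if loc.startswith(CORE_PREFIXES):
            continue
        entry = (m.group("func"), loc)
        if entry not in seen:
            seen.append(entry)
    return seen
```

Feeding the quoted trace to candidate_frames() would leave only the
fs/nilfs2/segment.c frame, which at best narrows down where to look for the
timer callback.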