Re: [syzbot] upstream boot error: BUG: unable to handle kernel NULL pointer dereference in gic_eoi_irq

From: Marc Zyngier
Date: Fri May 12 2023 - 06:58:15 EST


On Thu, 11 May 2023 22:41:11 +0100,
syzbot <syzbot+afc1d968649e7e851562@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: ac9a78681b92 Linux 6.4-rc1
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=102a3f6a280000
> kernel config: https://syzkaller.appspot.com/x/.config?x=cc86fee67199911d
> dashboard link: https://syzkaller.appspot.com/bug?extid=afc1d968649e7e851562
> compiler: arm-linux-gnueabi-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> userspace arch: arm
>
> Downloadable assets:
> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/c35b5b2731d2/non_bootable_disk-ac9a7868.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/c04bec59d77d/vmlinux-ac9a7868.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/070113b307f3/zImage-ac9a7868.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+afc1d968649e7e851562@xxxxxxxxxxxxxxxxxxxxxxxxx
>
> 8<--- cut here ---
> Unable to handle kernel NULL pointer dereference at virtual address 000005f4 when read
> [000005f4] *pgd=80000080004003, *pmd=00000000
> Internal error: Oops: 207 [#1] PREEMPT SMP ARM
> Modules linked in:
> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.4.0-rc1-syzkaller #0
> Hardware name: ARM-Versatile Express
> PC is at gic_eoi_irq+0x64/0x70 drivers/irqchip/irq-gic.c:228
> LR is at handle_percpu_devid_irq+0xb8/0x2d4 kernel/irq/chip.c:944
> pc : [<8087e328>] lr : [<802bf798>] psr: 20000193
> sp : df805f60 ip : df805f78 fp : df805f74
> r10: 00000000 r9 : 831f4680 r8 : 00000001
> r7 : 0000001c r6 : 81b0febc r5 : 000005f0 r4 : 8309a218
> r3 : 000005f0 r2 : 0009127a r1 : ddde8b00 r0 : 8309a218
> Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none
> Control: 30c5387d Table: 84804d80 DAC: 00000000
> Register r0 information: slab kmalloc-256 start 8309a200 pointer offset 24 size 256
> Register r1 information: non-slab/vmalloc memory
> Register r2 information:
> 8<--- cut here ---
> Unable to handle kernel NULL pointer dereference at virtual address 000001ff when read
> [000001ff] *pgd=80000080004003, *pmd=00000000
> Internal error: Oops: 207 [#2] PREEMPT SMP ARM
> Modules linked in:
> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.4.0-rc1-syzkaller #0
> Hardware name: ARM-Versatile Express
> PC is at __find_vmap_area mm/vmalloc.c:841 [inline]
> PC is at find_vmap_area mm/vmalloc.c:1862 [inline]
> PC is at find_vm_area mm/vmalloc.c:2623 [inline]
> PC is at vmalloc_dump_obj+0x38/0xb4 mm/vmalloc.c:4221
> LR is at __raw_spin_lock include/linux/spinlock_api_smp.h:132 [inline]
> LR is at _raw_spin_lock+0x18/0x58 kernel/locking/spinlock.c:154
> pc : [<8047a2ec>] lr : [<81801fd4>] psr: 20000193
> sp : df805df0 ip : df805dd8 fp : df805e04
> r10: 831f4680 r9 : 8261c9a4 r8 : 8285041c
> r7 : 60000193 r6 : 00000003 r5 : 00092000 r4 : 00000207
> r3 : 830e13a0 r2 : 00001dda r1 : 00000000 r0 : 00000001
> Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none
> Control: 30c5387d Table: 84804d80 DAC: 00000000

[hoping this will be read by a human and not one of these AI]

You keep sending me these reports because the GIC is in a
stacktrace.

But the root cause is probably somewhere else, as multiple runs of
the same kernel result in very different exceptions, most of which
never reach the point where it explodes in your stacktrace. Here's one
of them:

[ 1.572514][ T1] Freeing unused kernel image (initmem) memory: 2048K
[ 1.624529][ T1] Failed to set sysctl parameter 'vm.nr_hugepages=4': parameter not found
[ 1.626239][ T1] Failed to set sysctl parameter 'vm.nr_overcommit_hugepages=4': parameter not found
[ 1.628105][ T1] Failed to set sysctl parameter 'max_rcu_stall_to_panic=1': parameter not found
[ 1.630034][ T1] Run /sbin/init as init process
[ 1.663886][ T0] Insufficient stack space to handle exception!
[ 1.663894][ T0] Task stack: [0xdf8a0000..0xdf8a2000]
[ 1.666697][ T0] IRQ stack: [0xdf804000..0xdf806000]
[ 1.668019][ T0] Overflow stack: [0x830eb000..0x830ec000]
[ 1.669327][ T0] Internal error: kernel stack overflow: 0 [#1] PREEMPT SMP ARM
[ 1.671033][ T0] Modules linked in:
[ 1.671894][ T0] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.4.0-rc1-syzkaller #0
[ 1.673749][ T0] Hardware name: ����x���df!��`!� system
[ 1.675020][ T0] PC is at __dabt_svc+0x14/0x60
[ 1.676176][ T0] LR is at arch_cpu_idle+0x38/0x3c
[ 1.677328][ T0] pc : [<80200a74>] lr : [<80208eb8>] psr: 00000193
[ 1.678918][ T0] sp : df8a0010 ip : df8a1f60 fp : df8a1f5c
[ 1.680268][ T0] r10: 00000000 r9 : 827e1666 r8 : 00000000
[ 1.681630][ T0] r7 : 8260c4e0 r6 : 00000001 r5 : 8260c498 r4 : 831cc680
[ 1.683282][ T0] r3 : 8021b8c0 r2 : 00433fc1 r1 : 81f9d24c r0 : 82850250
[ 1.684957][ T0] Flags: nzcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
[ 1.686771][ T0] Control: 30c5383d Table: 80003000 DAC: dbadc0de
[ 1.688354][ T0] Register r0 information:
[ 1.713870][ T0] 8<--- cut here ---
[ 1.715765][ T0] Unhandled fault: unknown 3 (0xa03) at 0xdf8b9004
[ 1.717219][ T0] [df8b9004] *pgd=80000080007003, *pmd=83097003, *pte=802160e8fe83d71f
[ 2.073879][ C0] 8<--- cut here ---
[ 2.074748][ C0] Unable to handle kernel paging request at virtual address 830a2000 when execute
[ 2.076808][ C0] [830a2000] *pgd=80000080006003, *pmd=4000008300071d(bad)
[ 6.553998][ T0] Insufficient stack space to handle exception!
[ 6.554004][ T0] Task stack: [0xdf8a4000..0xdf8a6000]
[ 6.556644][ T0] IRQ stack: [0xdf808000..0xdf80a000]
[ 6.557922][ T0] Overflow stack: [0x830b8000..0x830b9000]
[ 18.824252][ T0] 8<--- cut here ---
[ 18.824265][ C4] 8<--- cut here ---
[ 18.824317][ T0] Insufficient stack space to handle exception!
[ 18.824320][ T0] Task stack: [0xdf8a8000..0xdf8aa000]
[ 18.824323][ T0] IRQ stack: [0xdf80c000..0xdf80e000]
[ 18.824326][ T0] Overflow stack: [0x830b9000..0x830ba000]
[ 18.825383][ T0] Unhandled fault: unknown 3 (0xa03) at 0xdf8b1004
[ 18.826484][ C4] Unable to handle kernel paging request at virtual address df84000c when read
[ 18.828182][ T0] [df8b1004] *pgd=80000080007003
[ 18.829838][ C4] [df84000c] *pgd=80000080007003
[ 18.831330][ T0] , *pmd=83097003
[ 18.832710][ C4] , *pmd=83097003
[ 18.834800][ T0] , *pte=8261d0a8fe83971f
[ 18.836939][ C4] , *pte=802160c4
[ 18.838162][ T0]
[ 18.843536][ C4]

So not much to do with the GIC, but more to do with general stack
overflow/corruption. I'd appreciate it if you could stop barking up
the wrong tree and get someone who is still interested in 32-bit ARM to
look into it.
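
To make the overflow concrete, here is a rough Python sketch that checks
how much room is left below sp in the stacks reported for CPU 1 in the
log above. The addresses are copied from that log; the helper name
headroom() is made up for illustration, and the sketch assumes a stack
that grows downwards, which is the case on ARM.

```python
# Stack ranges and sp copied from the CPU 1 report in the log above.
TASK_STACK = (0xdf8a0000, 0xdf8a2000)
IRQ_STACK = (0xdf804000, 0xdf806000)
SP = 0xdf8a0010


def headroom(sp, stack):
    """Bytes left between sp and the low end of a downward-growing stack.

    Returns None when sp does not point into the given stack at all.
    (Hypothetical helper, not a kernel API.)
    """
    low, high = stack
    if not (low <= sp < high):
        return None
    return sp - low


print(headroom(SP, TASK_STACK))  # 16 bytes left out of 0x2000
print(headroom(SP, IRQ_STACK))   # None: sp is not on the IRQ stack
```

With only 16 bytes left on an 8K task stack, the exception entry code
has no room to save its registers, which matches the "Insufficient stack
space" message rather than anything GIC-specific.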

Thanks,

M.

--
Without deviation from the norm, progress is not possible.