[paulmck-rcu:dev.2023.09.15a] [locktorture] 31f96f3c93: BUG:kernel_NULL_pointer_dereference,address

From: kernel test robot
Date: Wed Sep 20 2023 - 10:36:33 EST




Hello,

kernel test robot noticed "BUG:kernel_NULL_pointer_dereference,address" on:

commit: 31f96f3c93f79c901c06a805e8e23383d0cd4a6c ("locktorture: Dump CPUs running writer tasks when RCU stalls")
https://git.kernel.org/cgit/linux/kernel/git/paulmck/linux-rcu.git dev.2023.09.15a

in testcase: boot

compiler: gcc-7
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202309202238.bf86a734-oliver.sang@xxxxxxxxx


[ 163.797412][ C1] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 163.797412][ C1] #PF: supervisor read access in kernel mode
[ 163.805375][ C1] #PF: error_code(0x0000) - not-present page
[ 163.805375][ C1] PGD 0 P4D 0
[ 163.805375][ C1] Oops: 0000 [#1] SMP
[ 163.805375][ C1] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.6.0-rc1-00062-g31f96f3c93f7 #1
[ 163.805375][ C1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 163.813385][ C1] RIP: 0010:_find_next_bit (lib/find_bit.c:133 (discriminator 2))
[ 163.813385][ C1] Code: fd 40 0f b6 f6 48 c7 c7 a0 13 6d 83 e8 98 01 92 ff 4c 39 e3 76 5e 4c 89 e2 48 c7 c6 ff ff ff ff 44 89 e1 48 c1 ea 06 48 d3 e6 <48> 23 74 d5 00 75 30 48 83 c2 01 48 89 d1 48 c1 e1 06 48 39 d9 73
All code
========
0: fd std
1: 40 0f b6 f6 movzbl %sil,%esi
5: 48 c7 c7 a0 13 6d 83 mov $0xffffffff836d13a0,%rdi
c: e8 98 01 92 ff call 0xffffffffff9201a9
11: 4c 39 e3 cmp %r12,%rbx
14: 76 5e jbe 0x74
16: 4c 89 e2 mov %r12,%rdx
19: 48 c7 c6 ff ff ff ff mov $0xffffffffffffffff,%rsi
20: 44 89 e1 mov %r12d,%ecx
23: 48 c1 ea 06 shr $0x6,%rdx
27: 48 d3 e6 shl %cl,%rsi
2a:* 48 23 74 d5 00 and 0x0(%rbp,%rdx,8),%rsi <-- trapping instruction
2f: 75 30 jne 0x61
31: 48 83 c2 01 add $0x1,%rdx
35: 48 89 d1 mov %rdx,%rcx
38: 48 c1 e1 06 shl $0x6,%rcx
3c: 48 39 d9 cmp %rbx,%rcx
3f: 73 .byte 0x73

Code starting with the faulting instruction
===========================================
0: 48 23 74 d5 00 and 0x0(%rbp,%rdx,8),%rsi
5: 75 30 jne 0x37
7: 48 83 c2 01 add $0x1,%rdx
b: 48 89 d1 mov %rdx,%rcx
e: 48 c1 e1 06 shl $0x6,%rcx
12: 48 39 d9 cmp %rbx,%rcx
15: 73 .byte 0x73
[ 163.813385][ C1] RSP: 0000:ffff88842fc05dc8 EFLAGS: 00010046
[ 163.813385][ C1] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000
[ 163.813385][ C1] RDX: 0000000000000000 RSI: ffffffffffffffff RDI: ffffffff836d13a0
[ 163.813385][ C1] RBP: 0000000000000000 R08: 0000000000000010 R09: 0000000000000002
[ 163.813385][ C1] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 163.826802][ C1] R13: 0000000000000000 R14: 0000000000000001 R15: ffffffff84ce1520
[ 163.826802][ C1] FS: 0000000000000000(0000) GS:ffff88842fc00000(0000) knlGS:0000000000000000
[ 163.826802][ C1] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 163.833375][ C1] CR2: 0000000000000000 CR3: 0000000002ea4000 CR4: 00000000000406a0
[ 163.833375][ C1] Call Trace:
[ 163.833375][ C1] <IRQ>
[ 163.833375][ C1] ? __die_body (arch/x86/kernel/dumpstack.c:421)
[ 163.833375][ C1] ? page_fault_oops (arch/x86/mm/fault.c:702)
[ 163.833375][ C1] ? kernelmode_fixup_or_oops+0x103/0x140
[ 163.841375][ C1] ? exc_page_fault (arch/x86/include/asm/irqflags.h:26 arch/x86/include/asm/irqflags.h:67 arch/x86/include/asm/irqflags.h:127 arch/x86/mm/fault.c:1513 arch/x86/mm/fault.c:1561)
[ 163.841375][ C1] ? asm_exc_page_fault (arch/x86/include/asm/idtentry.h:570)
[ 163.841375][ C1] ? _find_next_bit (lib/find_bit.c:133 (discriminator 2))
[ 163.841375][ C1] ? _find_next_bit (lib/find_bit.c:133)
[ 163.841375][ C1] torture_spin_lock_dump (kernel/locking/locktorture.c:295 (discriminator 1))
[ 163.841375][ C1] notifier_call_chain (kernel/notifier.c:95)
[ 163.849372][ C1] atomic_notifier_call_chain (kernel/notifier.c:231)
[ 163.849372][ C1] rcu_sched_clock_irq (kernel/rcu/tree_stall.h:791 kernel/rcu/tree.c:3875 kernel/rcu/tree.c:2253)
[ 163.849372][ C1] ? account_system_index_time (include/linux/cgroup.h:423 include/linux/cgroup.h:492 include/linux/cgroup.h:733 kernel/sched/cputime.c:113 kernel/sched/cputime.c:176)
[ 163.849372][ C1] ? tick_sched_handle+0x70/0x70
[ 163.849372][ C1] update_process_times (arch/x86/include/asm/preempt.h:27 kernel/time/timer.c:2073)
[ 163.857379][ C1] tick_sched_timer (kernel/time/tick-sched.c:1492)
[ 163.857379][ C1] __hrtimer_run_queues (kernel/time/hrtimer.c:1688 kernel/time/hrtimer.c:1752)
[ 163.857379][ C1] hrtimer_interrupt (kernel/time/hrtimer.c:1817)
[ 163.857379][ C1] __sysvec_apic_timer_interrupt (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 arch/x86/include/asm/trace/irq_vectors.h:41 arch/x86/kernel/apic/apic.c:1081)
[ 163.857379][ C1] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14))
[ 163.857379][ C1] </IRQ>
[ 163.857379][ C1] <TASK>
[ 163.865377][ C1] asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:645)
[ 163.865377][ C1] RIP: 0010:_raw_spin_unlock_irqrestore (kernel/locking/spinlock.c:195)
[ 163.865377][ C1] Code: 0f 95 c6 48 89 c5 31 c9 31 d2 40 0f b6 f6 e8 0e 6e 17 ff 48 85 ed 74 05 e8 94 9d fe ff 48 85 db 74 01 fb 65 ff 0d 3f 20 0a 7e <5b> 5d c3 0f 1f 40 00 e8 5b 64 fb fe 53 48 89 fb 65 ff 05 28 20 0a
All code
========
0: 0f 95 c6 setne %dh
3: 48 89 c5 mov %rax,%rbp
6: 31 c9 xor %ecx,%ecx
8: 31 d2 xor %edx,%edx
a: 40 0f b6 f6 movzbl %sil,%esi
e: e8 0e 6e 17 ff call 0xffffffffff176e21
13: 48 85 ed test %rbp,%rbp
16: 74 05 je 0x1d
18: e8 94 9d fe ff call 0xfffffffffffe9db1
1d: 48 85 db test %rbx,%rbx
20: 74 01 je 0x23
22: fb sti
23: 65 ff 0d 3f 20 0a 7e decl %gs:0x7e0a203f(%rip) # 0x7e0a2069
2a:* 5b pop %rbx <-- trapping instruction
2b: 5d pop %rbp
2c: c3 ret
2d: 0f 1f 40 00 nopl 0x0(%rax)
31: e8 5b 64 fb fe call 0xfffffffffefb6491
36: 53 push %rbx
37: 48 89 fb mov %rdi,%rbx
3a: 65 gs
3b: ff .byte 0xff
3c: 05 .byte 0x5
3d: 28 20 sub %ah,(%rax)
3f: 0a .byte 0xa

Code starting with the faulting instruction
===========================================
0: 5b pop %rbx
1: 5d pop %rbp
2: c3 ret
3: 0f 1f 40 00 nopl 0x0(%rax)
7: e8 5b 64 fb fe call 0xfffffffffefb6467
c: 53 push %rbx
d: 48 89 fb mov %rdi,%rbx
10: 65 gs
11: ff .byte 0xff
12: 05 .byte 0x5
13: 28 20 sub %ah,(%rax)
15: 0a .byte 0xa
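
Reading note on the first decode above: the trapping instruction loads from
0x0(%rbp,%rdx,8) with RBP and RDX both zero, which gives address 0 and matches
CR2. RBX = 2 and R12 = 0 look like the bit count and start offset, so this is
consistent with _find_next_bit being handed a NULL bitmap pointer. The
following is a minimal userspace sketch of that reading, not the kernel's code:
effective_address() and find_next_bit_sketch() are hypothetical names, and the
NULL guard is added purely for illustration (lib/find_bit.c has no such check).

```c
#include <stddef.h>
#include <stdint.h>

#define WORD_BITS 64u

/* Address computed by `and disp(%base,%index,8),%rsi`: base + index * 8 + disp. */
static uint64_t effective_address(uint64_t base, uint64_t index, uint64_t disp)
{
	return base + index * 8 + disp;
}

/*
 * Simplified word-at-a-time search for the next set bit at or after `start`.
 * Returns `nbits` when no bit is found. The NULL check is an illustrative
 * assumption; without it the first load below would read address 0.
 */
static uint64_t find_next_bit_sketch(const uint64_t *addr, uint64_t nbits,
				     uint64_t start)
{
	uint64_t idx, tmp, bit;

	if (addr == NULL || start >= nbits)
		return nbits;

	idx = start / WORD_BITS;
	/* Mask off bits below `start` in the first word (the shl/and pair above). */
	tmp = addr[idx] & (~UINT64_C(0) << (start % WORD_BITS));

	while (tmp == 0) {
		idx++;
		if (idx * WORD_BITS >= nbits)
			return nbits;
		tmp = addr[idx];
	}

	bit = idx * WORD_BITS + (uint64_t)__builtin_ctzll(tmp);
	return bit < nbits ? bit : nbits;
}
```

With the register values from the oops, effective_address(0, 0, 0) yields 0,
the faulting address. Which pointer torture_spin_lock_dump passes at
kernel/locking/locktorture.c:295 is not visible in this log, so the NULL-bitmap
reading remains an inference from the registers.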


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20230920/202309202238.bf86a734-oliver.sang@xxxxxxxxx



--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki