[linus:master] [workqueue] 8a1dd1e547: BUG:KCSAN:data-race_in_print_report/wq_worker_tick

From: kernel test robot
Date: Mon Sep 25 2023 - 03:20:03 EST




Hello,

kernel test robot noticed "BUG:KCSAN:data-race_in_print_report/wq_worker_tick" on:

commit: 8a1dd1e547c1a037692e7a6da6a76108108c72b1 ("workqueue: Track and monitor per-workqueue CPU time usage")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on linus/master 2cf0f715623872823a72e451243bbf555d10d032]
[test failed on linux-next/master 29e400e3ea486bf942b214769fc9778098114113]

in testcase: boot

compiler: gcc-12
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)


we found various KCSAN issues happen on both parent/fbc, but below (1)
only happens on this commit and clean on parent after almost 1000 runs.
we don't have enough knowledge how connect the code change to the issue,
just report what we observed in our tests FYI.

6363845005202148 8a1dd1e547c1a037692e7a6da6a
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
956:986 -0% 955:981 dmesg.BUG:KCSAN:data-race_in__mutex_add_waiter/print_report
4:986 -0% 1:981 dmesg.BUG:KCSAN:data-race_in__mutex_remove_waiter/print_report
1:986 -0% :981 dmesg.BUG:KCSAN:data-race_in_do_timer/print_report
1:986 0% 2:981 dmesg.BUG:KCSAN:data-race_in_enqueue_timer/print_report
4:986 -0% 1:981 dmesg.BUG:KCSAN:data-race_in_print_report/process_one_work
5:986 -0% 4:981 dmesg.BUG:KCSAN:data-race_in_print_report/threadfunc
4:986 0% 4:981 dmesg.BUG:KCSAN:data-race_in_print_report/tick_nohz_handler
175:986 2% 199:981 dmesg.BUG:KCSAN:data-race_in_print_report/tick_nohz_stop_tick
:986 2% 24:981 dmesg.BUG:KCSAN:data-race_in_print_report/wq_worker_tick <----- (1)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202309251423.8ef230c0-oliver.sang@xxxxxxxxx


[ 121.059354][ C0] ==================================================================
[ 121.060132][ C0] BUG: KCSAN: data-race in print_report / wq_worker_tick
[ 121.060132][ C0]
[ 121.060132][ C0] write to 0xffff88810336eaa8 of 8 bytes by interrupt on cpu 1:
[ 121.060132][ C0] wq_worker_tick (kernel/workqueue.c:1140)
[ 121.060132][ C0] scheduler_tick (kernel/sched/core.c:5639)
[ 121.060132][ C0] update_process_times (kernel/time/timer.c:2078)
[ 121.060132][ C0] tick_nohz_handler (kernel/time/tick-sched.c:1369)
[ 121.060132][ C0] __sysvec_apic_timer_interrupt (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 arch/x86/include/asm/trace/irq_vectors.h:41 arch/x86/kernel/apic/apic.c:1113)
[ 121.060132][ C0] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1106 (discriminator 14))
[ 121.060132][ C0] asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:645)
[ 121.060132][ C0] kcsan_setup_watchpoint (kernel/kcsan/core.c:704 (discriminator 7))
[ 121.060132][ C0] do_raw_spin_unlock (kernel/locking/spinlock_debug.c:99 kernel/locking/spinlock_debug.c:140)
[ 121.060132][ C0] _raw_spin_unlock (arch/x86/include/asm/preempt.h:85 include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:186)
[ 121.060132][ C0] __ww_mutex_lock+0x7fb/0xcfd
[ 121.060132][ C0] ww_mutex_lock (kernel/locking/mutex.c:873)
[ 121.060132][ C0] stress_one_work (kernel/locking/test-ww_mutex.c:545)
[ 121.060132][ C0] process_one_work (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/workqueue.h:108 kernel/workqueue.c:2589)
[ 121.060132][ C0] worker_thread (include/linux/list.h:292 kernel/workqueue.c:2736)
[ 121.060132][ C0] kthread (kernel/kthread.c:381)
[ 121.060132][ C0] ret_from_fork (arch/x86/entry/entry_64.S:314)
[ 121.060132][ C0]
[ 121.060132][ C0] read to 0xffff88810336eaa8 of 8 bytes by interrupt on cpu 0:
[ 121.060132][ C0] print_report (kernel/kcsan/report.c:396)
[ 121.060132][ C0] kcsan_report_known_origin (kernel/kcsan/report.c:504 kernel/kcsan/report.c:694)
[ 121.060132][ C0] kcsan_setup_watchpoint (kernel/kcsan/core.c:678)
[ 121.060132][ C0] wq_worker_tick (kernel/workqueue.c:1140)
[ 121.060132][ C0] scheduler_tick (kernel/sched/core.c:5639)
[ 121.060132][ C0] update_process_times (kernel/time/timer.c:2078)
[ 121.060132][ C0] tick_nohz_handler (kernel/time/tick-sched.c:1369)
[ 121.060132][ C0] __sysvec_apic_timer_interrupt (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 arch/x86/include/asm/trace/irq_vectors.h:41 arch/x86/kernel/apic/apic.c:1113)
[ 121.060132][ C0] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1106 (discriminator 14))
[ 121.060132][ C0] asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:645)
[ 121.060132][ C0] queued_spin_lock_slowpath (include/linux/instrumented.h:69 include/linux/atomic/atomic-instrumented.h:27 arch/x86/include/asm/qspinlock.h:100 kernel/locking/qspinlock.c:327)
[ 121.060132][ C0] do_raw_spin_lock (kernel/locking/spinlock_debug.c:93 kernel/locking/spinlock_debug.c:117)
[ 121.060132][ C0] _raw_spin_lock (include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154)
[ 121.060132][ C0] __ww_mutex_lock+0x941/0xcfd
[ 121.060132][ C0] ww_mutex_lock (kernel/locking/mutex.c:873)
[ 121.060132][ C0] stress_reorder_work (kernel/locking/test-ww_mutex.c:505)
[ 121.060132][ C0] process_one_work (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/workqueue.h:108 kernel/workqueue.c:2589)
[ 121.060132][ C0] worker_thread (include/linux/list.h:292 kernel/workqueue.c:2736)
[ 121.060132][ C0] kthread (kernel/kthread.c:381)
[ 121.060132][ C0] ret_from_fork (arch/x86/entry/entry_64.S:314)
[ 121.060132][ C0]
[ 121.060132][ C0] value changed: 0x00000000001364d8 -> 0x00000000001368c0
[ 121.060132][ C0]
[ 121.060132][ C0] Reported by Kernel Concurrency Sanitizer on:
[ 121.060132][ C0] CPU: 0 PID: 42 Comm: kworker/u4:3 Tainted: G T 6.4.0-rc1-00009-g8a1dd1e547c1 #1
[ 121.060132][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 121.060132][ C0] Workqueue: test-ww_mutex stress_reorder_work
[ 121.060132][ C0] ==================================================================
[ 121.932570][ T1] All ww mutex selftests passed
[ 121.942977][ T1] Initialise system trusted keyrings
[ 121.950392][ T1] workingset: timestamp_bits=46 max_order=22 bucket_order=0
[ 121.975697][ T1] NET: Registered PF_ALG protocol family
[ 121.977547][ T1] Key type asymmetric registered
[ 121.979060][ T1] Asymmetric key parser 'x509' registered
[ 121.980897][ T1] Asymmetric key parser 'pkcs8' registered
[ 121.982622][ T1]
[ 121.982622][ T1] Start testing find_bit() with random-filled bitmap
[ 122.008302][ T1] find_next_bit: 8628942 ns, 163936 iterations
[ 122.020467][ T1] find_next_zero_bit: 9789528 ns, 163745 iterations
[ 122.032835][ T1] find_last_bit: 10001837 ns, 163935 iterations
[ 122.248386][ T1] find_nth_bit: 213810108 ns, 16175 iterations
[ 122.437809][ T1] find_first_bit: 187071076 ns, 16176 iterations



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20230925/202309251423.8ef230c0-oliver.sang@xxxxxxxxx



--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki