Re: [BUG] KCSAN: data-race in tick_nohz_idle_stop_tick / tick_nohz_next_event

From: Mirsad Todorovac
Date: Sat Mar 09 2024 - 11:14:02 EST


Hi, Sean,

I am reducing Cc: so if you could please attach the decode_stacktrace.sh output of your build.

Thanks,
Mirsad

On 3/7/24 20:43, Sean Anderson wrote:
On 8/18/23 10:15, Mirsad Todorovac wrote:


On 8/17/23 21:32, Mirsad Todorovac wrote:
Hi,

This is your friendly bug reporter. :-)

The environment is the vanilla torvalds kernel on Ubuntu 22.04 LTS and a Ryzen 7950X assembled box.

[ 7145.500079] ==================================================================
[ 7145.500109] BUG: KCSAN: data-race in tick_nohz_idle_stop_tick / tick_nohz_next_event

[ 7145.500139] write to 0xffffffff8a2bda30 of 4 bytes by task 0 on cpu 26:
[ 7145.500155] tick_nohz_idle_stop_tick+0x3b1/0x4a0
[ 7145.500173] do_idle+0x1e3/0x250
[ 7145.500186] cpu_startup_entry+0x20/0x30
[ 7145.500199] start_secondary+0x129/0x160
[ 7145.500216] secondary_startup_64_no_verify+0x17e/0x18b

[ 7145.500245] read to 0xffffffff8a2bda30 of 4 bytes by task 0 on cpu 16:
[ 7145.500261] tick_nohz_next_event+0xe7/0x1e0
[ 7145.500277] tick_nohz_get_sleep_length+0xa7/0xe0
[ 7145.500293] menu_select+0x82/0xb90
[ 7145.500309] cpuidle_select+0x44/0x60
[ 7145.500324] do_idle+0x1c2/0x250
[ 7145.500336] cpu_startup_entry+0x20/0x30
[ 7145.500350] start_secondary+0x129/0x160
[ 7145.500367] secondary_startup_64_no_verify+0x17e/0x18b

[ 7145.500392] value changed: 0x0000001a -> 0xffffffff

[ 7145.500411] Reported by Kernel Concurrency Sanitizer on:
[ 7145.500421] CPU: 16 PID: 0 Comm: swapper/16 Tainted: G L 6.5.0-rc6-net-cfg-kcsan-00038-g16931859a650 #35
[ 7145.500439] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
[ 7145.500449] ==================================================================

Please find dmesg log (what was available in the ring buffer) and the lshw output attached.

NOTE: There are no proprietary drivers loaded and Canonical livepatch isn't working for this kernel,
probably previous ACPI KCSAN BUGs turn this taint on.

Thank you very much for your time evaluating this report.

P.S.

According to Mr. Heo, I will add the decoded stacktrace to this bug report, which ought to have been done
in the first place :-/

[ 7145.500079] ==================================================================
[ 7145.500109] BUG: KCSAN: data-race in tick_nohz_idle_stop_tick / tick_nohz_next_event

[ 7145.500139] write to 0xffffffff8a2bda30 of 4 bytes by task 0 on cpu 26:
[ 7145.500155] tick_nohz_idle_stop_tick (kernel/time/tick-sched.c:904 kernel/time/tick-sched.c:1130)
[ 7145.500173] do_idle (kernel/sched/idle.c:215 kernel/sched/idle.c:282)
[ 7145.500186] cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
[ 7145.500199] start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
[ 7145.500216] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:441)

[ 7145.500245] read to 0xffffffff8a2bda30 of 4 bytes by task 0 on cpu 16:
[ 7145.500261] tick_nohz_next_event (kernel/time/tick-sched.c:868)
[ 7145.500277] tick_nohz_get_sleep_length (kernel/time/tick-sched.c:1250)
[ 7145.500293] menu_select (drivers/cpuidle/governors/menu.c:283)
[ 7145.500309] cpuidle_select (drivers/cpuidle/cpuidle.c:359)
[ 7145.500324] do_idle (kernel/sched/idle.c:208 kernel/sched/idle.c:282)
[ 7145.500336] cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
[ 7145.500350] start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
[ 7145.500367] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:441)

[ 7145.500392] value changed: 0x0000001a -> 0xffffffff

[ 7145.500411] Reported by Kernel Concurrency Sanitizer on:
[ 7145.500421] CPU: 16 PID: 0 Comm: swapper/16 Tainted: G L 6.5.0-rc6-net-cfg-kcsan-00038-g16931859a650 #35
[ 7145.500439] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
[ 7145.500449] ==================================================================

The code seems to be:


/*
* If this CPU is the one which updates jiffies, then give up
* the assignment and let it be taken by the CPU which runs
* the tick timer next, which might be this CPU as well. If we
* don't drop this here the jiffies might be stale and
* do_timer() never invoked. Keep track of the fact that it
* was the one which had the do_timer() duty last.
*/
if (cpu == tick_do_timer_cpu) {
tick_do_timer_cpu = TICK_DO_TIMER_NONE;
→ ts->do_timer_last = 1;
} else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) {
ts->do_timer_last = 0;

}

but I am not sure why this happens.

Maybe a WRITE_ONCE(ts->do_timer_last, 1) is required?

Or is it another KCSAN false positive ...

I have seen a few more KCSAN reports regarding this variable:

[ 153.285199] ==================================================================
[ 153.292529] BUG: KCSAN: data-race in tick_nohz_next_event+0x15c/0x29c
[ 153.299118]
[ 153.300685] race at unknown origin, with read to 0xffffffc082aa1048 of 4 bytes by task 0 on cpu 2:
[ 153.309760] tick_nohz_next_event+0x15c/0x29c
[ 153.314256] tick_nohz_get_sleep_length+0xd8/0x118
[ 153.319186] menu_select+0x1e0/0xc08
[ 153.322875] cpuidle_select+0x4c/0x60
[ 153.326686] do_idle+0x1e0/0x2b4
[ 153.330062] cpu_startup_entry+0x34/0x3c
[ 153.334133] secondary_start_kernel+0x1e8/0x21c
[ 153.338821] __secondary_switched+0xb8/0xbc
[ 153.343143]
[ 153.344706] no locks held by swapper/2/0.
[ 153.348802] irq event stamp: 129586
[ 153.352370] hardirqs last enabled at (129585): [<ffffffc081acaa2c>] el1_interrupt+0x40/0x50
[ 153.360964] hardirqs last disabled at (129586): [<ffffffc0800d8fec>] do_idle+0xc8/0x2b4
[ 153.369132] softirqs last enabled at (129560): [<ffffffc08001071c>] __do_softirq+0x47c/0x500
[ 153.377786] softirqs last disabled at (129549): [<ffffffc080017134>] ____do_softirq+0x10/0x1c
[ 153.386457]
[ 153.388019] value changed: 0xffffffff -> 0x00000001
[ 153.392993]
[ 153.394555] Reported by Kernel Concurrency Sanitizer on:
[ 153.399947] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 6.6.20+ #102
[ 153.406595] Hardware name: xlnx,zynqmp (DT)
[ 153.410866] ==================================================================

and:

[ 75.362055] ==================================================================
[ 75.369461] BUG: KCSAN: data-race in tick_sched_do_timer / tick_sched_do_timer
[ 75.376854]
[ 75.378426] write to 0xffffffc082aa1048 of 4 bytes by interrupt on cpu 1:
[ 75.385335] tick_sched_do_timer+0x124/0x168
[ 75.389744] tick_sched_timer+0x48/0xc8
[ 75.393720] __hrtimer_run_queues+0x200/0x5f0
[ 75.398198] hrtimer_interrupt+0x1b0/0x3f4
[ 75.402417] arch_timer_handler_phys+0x40/0x54
[ 75.406983] handle_percpu_devid_irq+0xf8/0x20c
[ 75.411679] generic_handle_domain_irq+0x48/0x64
[ 75.416435] gic_handle_irq+0xc4/0x100
[ 75.420298] call_on_irq_stack+0x24/0x4c
[ 75.424360] do_interrupt_handler+0xc0/0xc4
[ 75.428692] el1_interrupt+0x34/0x50
[ 75.432415] el1h_64_irq_handler+0x18/0x24
[ 75.436668] el1h_64_irq+0x64/0x68
[ 75.440184] percpu_counter_add_batch+0xd8/0x1b8
[ 75.444967] set_pte_range+0x14c/0x2c8
[ 75.448838] filemap_map_pages+0x2dc/0xa38
[ 75.453091] __handle_mm_fault+0x1304/0x17dc
[ 75.457492] handle_mm_fault+0xa0/0x374
[ 75.461450] do_page_fault+0x1f8/0x620
[ 75.465365] do_translation_fault+0x118/0x138
[ 75.469887] do_mem_abort+0x5c/0xe4
[ 75.473516] el0_da+0x5c/0xac
[ 75.476623] el0t_64_sync_handler+0xe4/0x158
[ 75.481050] el0t_64_sync+0x190/0x194
[ 75.484834]
[ 75.486405] 5 locks held by (d-logind)/293:
[ 75.490693] #0: ffffff88076753c0 (&vma->vm_lock->lock){++++}-{3:3}, at: lock_vma_under_rcu+0x15c/0x5c0
[ 75.500441] #1: ffffffc082b042f0 (rcu_read_lock){....}-{1:2}, at: __handle_mm_fault+0x1260/0x17dc
[ 75.509755] #2: ffffffc082b042f0 (rcu_read_lock){....}-{1:2}, at: filemap_map_pages+0x0/0xa38
[ 75.518748] #3: ffffffc082b042f0 (rcu_read_lock){....}-{1:2}, at: __pte_offset_map+0x0/0x2a0
[ 75.527662] #4: ffffff880762b018 (ptlock_ptr(ptdesc)){+.+.}-{2:2}, at: __pte_offset_map_lock+0xa4/0x264
[ 75.537488] irq event stamp: 1761
[ 75.540890] hardirqs last enabled at (1760): [<ffffffc080bc1f78>] percpu_counter_add_batch+0x160/0x1b8
[ 75.550465] hardirqs last disabled at (1761): [<ffffffc081aca990>] el1_interrupt+0x24/0x50
[ 75.558893] softirqs last enabled at (1534): [<ffffffc08001071c>] __do_softirq+0x47c/0x500
[ 75.567382] softirqs last disabled at (1529): [<ffffffc080017134>] ____do_softirq+0x10/0x1c
[ 75.575889]
[ 75.577460] read to 0xffffffc082aa1048 of 4 bytes by interrupt on cpu 2:
[ 75.584273] tick_sched_do_timer+0x48/0x168
[ 75.588596] tick_sched_timer+0x48/0xc8
[ 75.592572] __hrtimer_run_queues+0x200/0x5f0
[ 75.597050] hrtimer_interrupt+0x1b0/0x3f4
[ 75.601269] arch_timer_handler_phys+0x40/0x54
[ 75.605835] handle_percpu_devid_irq+0xf8/0x20c
[ 75.610531] generic_handle_domain_irq+0x48/0x64
[ 75.615287] gic_handle_irq+0xc4/0x100
[ 75.619150] call_on_irq_stack+0x24/0x4c
[ 75.623204] do_interrupt_handler+0xc0/0xc4
[ 75.627535] el1_interrupt+0x34/0x50
[ 75.631258] el1h_64_irq_handler+0x18/0x24
[ 75.635512] el1h_64_irq+0x64/0x68
[ 75.639027] lock_acquire+0x13c/0x2f0
[ 75.642820] count_memcg_event_mm.part.0+0x70/0x2a8
[ 75.647829] handle_mm_fault+0x1d8/0x374
[ 75.651874] do_page_fault+0x1f8/0x620
[ 75.655789] do_mem_abort+0x5c/0xe4
[ 75.659417] el0_da+0x5c/0xac
[ 75.662524] el0t_64_sync_handler+0xe4/0x158
[ 75.666951] el0t_64_sync+0x190/0x194
[ 75.670736]
[ 75.672307] 2 locks held by systemd/1:
[ 75.676152] #0: ffffff88042627b0 (&vma->vm_lock->lock){++++}-{3:3}, at: lock_vma_under_rcu+0x15c/0x5c0
[ 75.685900] #1: ffffffc082b042f0 (rcu_read_lock){....}-{1:2}, at: count_memcg_event_mm.part.0+0x0/0x2a8
[ 75.695726] irq event stamp: 576347
[ 75.699302] hardirqs last enabled at (576346): [<ffffffc08001af70>] local_daif_restore+0x38/0x54
[ 75.708338] hardirqs last disabled at (576347): [<ffffffc081aca990>] el1_interrupt+0x24/0x50
[ 75.716940] softirqs last enabled at (576161): [<ffffffc08001071c>] __do_softirq+0x47c/0x500
[ 75.725602] softirqs last disabled at (576154): [<ffffffc080017134>] ____do_softirq+0x10/0x1c
[ 75.734283]
[ 75.735854] value changed: 0xffffffff -> 0x00000000
[ 75.740836]
[ 75.742407] Reported by Kernel Concurrency Sanitizer on:
[ 75.747808] CPU: 2 PID: 1 Comm: systemd Not tainted 6.6.20+ #101
[ 75.754290] Hardware name: xlnx,zynqmp (DT)
[ 75.758563] ==================================================================

The justification seems to be in tick_sched_do_timer:

/*
* Check if the do_timer duty was dropped. We don't care about
* concurrency: This happens only when the CPU in charge went
* into a long sleep. If two CPUs happen to assign themselves to
* this duty, then the jiffies update is still serialized by
* jiffies_lock.
*
* If nohz_full is enabled, this should not happen because the
* tick_do_timer_cpu never relinquishes.
*/

with the other assignment being in tick_nohz_stop_tick. I'm not familiar
enough with this code to say whether we should be using READ/WRITE_ONCE
or maybe just data_race (as would be implied by the comment above).

--Sean

[Embedded World 2024, SECO SpA]<https://www.messe-ticket.de/Nuernberg/embeddedworld2024/Register/ew24517689>