[PATCH v2 2/2] locking: Mark racy reads of owner->on_cpu

From: Kefeng Wang
Date: Fri Dec 03 2021 - 02:48:59 EST


From: Marco Elver <elver@xxxxxxxxxx>

One of the more frequent data races reported by KCSAN is the racy read
in mutex_spin_on_owner(), which is usually reported as "race of unknown
origin" without showing the writer. This is due to the racing write
occurring in kernel/sched. Locally enabling KCSAN in kernel/sched shows:

| write (marked) to 0xffff97f205079934 of 4 bytes by task 316 on cpu 6:
| finish_task kernel/sched/core.c:4632 [inline]
| finish_task_switch kernel/sched/core.c:4848
| context_switch kernel/sched/core.c:4975 [inline]
| __schedule kernel/sched/core.c:6253
| schedule kernel/sched/core.c:6326
| schedule_preempt_disabled kernel/sched/core.c:6385
| __mutex_lock_common kernel/locking/mutex.c:680
| __mutex_lock kernel/locking/mutex.c:740 [inline]
| __mutex_lock_slowpath kernel/locking/mutex.c:1028
| mutex_lock kernel/locking/mutex.c:283
| tty_open_by_driver drivers/tty/tty_io.c:2062 [inline]
| ...
|
| read to 0xffff97f205079934 of 4 bytes by task 322 on cpu 3:
| mutex_spin_on_owner kernel/locking/mutex.c:370
| mutex_optimistic_spin kernel/locking/mutex.c:480
| __mutex_lock_common kernel/locking/mutex.c:610
| __mutex_lock kernel/locking/mutex.c:740 [inline]
| __mutex_lock_slowpath kernel/locking/mutex.c:1028
| mutex_lock kernel/locking/mutex.c:283
| tty_open_by_driver drivers/tty/tty_io.c:2062 [inline]
| ...
|
| value changed: 0x00000001 -> 0x00000000

This race is clearly intentional, and the potential for miscompilation
is slim due to surrounding barrier() and cpu_relax(), and the value
being used as a boolean.

Nevertheless, marking this reader would more clearly denote intent and
make it obvious that concurrency is expected. Use READ_ONCE() to avoid
having to reason about compiler optimizations now and in future.

With previous refactor, mark the read to owner->on_cpu in owner_on_cpu(),
which immediately precedes the loop executing mutex_spin_on_owner().

Signed-off-by: Marco Elver <elver@xxxxxxxxxx>
Signed-off-by: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
---
include/linux/sched.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index ff609d9c2f21..0b9b0e3f4791 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2177,7 +2177,7 @@ static inline bool owner_on_cpu(struct task_struct *owner)
* As lock holder preemption issue, we both skip spinning if
* task is not on cpu or its cpu is preempted
*/
- return owner->on_cpu && !vcpu_is_preempted(task_cpu(owner));
+ return READ_ONCE(owner->on_cpu) && !vcpu_is_preempted(task_cpu(owner));
}

/* Returns effective CPU energy utilization, as seen by the scheduler */
--
2.26.2