[PATCH v2 0/4] locking/rtmutex: Avoid overwriting pi_blocked_on while invoking blk_flush_plug().

From: Sebastian Andrzej Siewior
Date: Thu Apr 27 2023 - 07:20:11 EST


Hi,

Crystal Wood reported that task_struct::pi_blocked_on can be overwritten
by mistake, as in the following call chain:
  rt_mutex_slowlock()
  - task_blocks_on_rt_mutex()
    - current->pi_blocked_on = waiter;
  - rt_mutex_slowlock_block()
    - schedule()
      - sched_submit_work()
        - blk_flush_plug()
          - *any* RT sleeping lock used by the plug
            - rtlock_slowlock_locked()
              - task_blocks_on_rt_mutex()
                - current->pi_blocked_on = waiter; <-- XXX

The requirements are:
- I/O queued
- lock contention on a sleeping lock (a struct mutex)
- lock contention while flushing queued I/O (in blk_flush_plug(), a
  spinlock_t on PREEMPT_RT).

Later in review, tglx pointed out that any function invoked within
sched_submit_work() is affected, so the problem is not limited to
blk_flush_plug().

This series addresses the problem by
- exporting sched_submit_work()
- invoking sched_submit_work() once it is clear that the slow path is
  needed.
- invoking schedule_rtmutex() while blocking on the lock; it contains only
  the schedule loop (without sched_submit_work()).
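
The difference in ordering can be illustrated with a small userspace
model. This is only a sketch and not kernel code: struct task,
struct waiter and all model_*() helpers are hypothetical stand-ins,
and the "contended" flag simply assumes that the flush hits a
contended sleeping lock.

```c
#include <stddef.h>

/* Hypothetical userspace model; none of this is kernel code. */
struct waiter {
	const char *lock_name;
};

struct task {
	struct waiter *pi_blocked_on;
	int overwrites;	/* stores that clobbered a non-NULL pointer */
};

/* Models the pi_blocked_on store done when a task blocks on a lock. */
static void model_block_on(struct task *t, struct waiter *w)
{
	if (t->pi_blocked_on != NULL)
		t->overwrites++;
	t->pi_blocked_on = w;
}

/* Models clearing pi_blocked_on once the lock has been acquired. */
static void model_unblock(struct task *t)
{
	t->pi_blocked_on = NULL;
}

/*
 * Stands in for sched_submit_work(): if the flush hits a contended
 * sleeping lock, the task blocks on it and later acquires it.
 */
static void model_submit_work(struct task *t, int contended)
{
	struct waiter inner = { "plug lock" };

	if (contended) {
		model_block_on(t, &inner);
		model_unblock(t);
	}
}

/* Old ordering: the work is submitted after the waiter is enqueued. */
static int model_slowlock_buggy(struct task *t, int contended)
{
	struct waiter outer = { "outer lock" };
	int lost;

	model_block_on(t, &outer);
	model_submit_work(t, contended);	/* from within schedule() */
	lost = (t->pi_blocked_on != &outer);
	model_unblock(t);
	return lost;
}

/* New ordering: submit the work first, then enqueue and only schedule. */
static int model_slowlock_fixed(struct task *t, int contended)
{
	struct waiter outer = { "outer lock" };
	int lost;

	model_submit_work(t, contended);	/* before blocking */
	model_block_on(t, &outer);
	/* schedule loop only, no submit work in here */
	lost = (t->pi_blocked_on != &outer);
	model_unblock(t);
	return lost;
}
```

With contended set to 1, model_slowlock_buggy() reports that the outer
waiter was lost and the overwrite counter is incremented, while
model_slowlock_fixed() keeps the outer waiter intact.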

Original report by Crystal
https://lore.kernel.org/all/4b4ab374d3e24e6ea8df5cadc4297619a6d945af.camel@xxxxxxxxxx

v1: https://lore.kernel.org/all/20230322162719.wYG1N0hh@xxxxxxxxxxxxx

v1…v2:
- Avoid invoking blk_flush_plug() with DEBUG-enabled
- Fix also the ww-mutex implementation based on RT-mutex.
- Export sched_submit_work() and invoke the whole function before
  blocking, not just blk_flush_plug().

Sebastian