Re: [PATCH PREEMPT RT] rt-mutex: fix deadlock in device mapper

From: Mike Galbraith
Date: Tue Nov 21 2017 - 04:18:30 EST


On Tue, 2017-11-21 at 09:37 +0100, Thomas Gleixner wrote:
> On Tue, 21 Nov 2017, Mike Galbraith wrote:
> > On Mon, 2017-11-20 at 16:33 -0500, Mikulas Patocka wrote:
> > >
> > > Is there some specific scenario where you need to call
> > > blk_schedule_flush_plug from rt_spin_lock_fastlock?
> >
> > Excellent question.  What's the difference between not getting IO
> > started because you meet a mutex with an rt_mutex under the hood, and
> > not getting IO started because you meet a spinlock with an rt_mutex
> > under the hood?  If just doing the mutex side puts this thing back to
> > sleep, I'm happy.
>
> Think about it from the mainline POV.
>
> The spinlock cannot ever go to schedule and therefore cannot create a
> situation which requires an unplug. The RT substitution of the spinlock
> with a rtmutex based sleeping spinlock should not change that at all.
>
> A regular mutex/rwsem etc. can and will unplug when the lock is contended
> and the caller blocks. The RT conversion of these locks to rtmutex based
> variants creates the problem: Unplug cannot be called when the task has
> pi_blocked_on set because the unplug path might contend on yet another
> lock. So unplugging in the slow path before setting pi_blocked_on is the
> right thing to do.

Sure.  What alarms me about IO deadlocks reappearing after all this
time is that back when I first met them, I needed every last bit of the
patchlet I showed to kill them, whether that should have been the case
or not.  'course, that tree contained roughly a zillion patches...
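FWIW, here's how I read the ordering constraint, sketched out below.
The sched_submit_work() bit is roughly what mainline does (from memory,
so salt to taste); the rt_mutex_lock_sketch()/rt_mutex_slowlock_sketch()
names are made up for illustration only, not the actual -rt tree code.

        /* Mainline: the unplug happens on the way into schedule()... */
        static inline void sched_submit_work(struct task_struct *tsk)
        {
                /*
                 * ...but is skipped when the task is already blocked on
                 * an rtmutex, because the unplug path itself may contend
                 * on yet another lock.
                 */
                if (!tsk->state || tsk_is_pi_blocked(tsk))
                        return;

                if (blk_needs_flush_plug(tsk))
                        blk_schedule_flush_plug(tsk);
        }

        /*
         * RT sleeping mutex, sketch only: once the slow path has set
         * pi_blocked_on, sched_submit_work() will skip the unplug, so
         * the plug has to be flushed before that point.
         */
        static void rt_mutex_lock_sketch(struct rt_mutex *lock)
        {
                if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
                        return;         /* uncontended, never sleeps */

                /* pi_blocked_on is still NULL here, safe to unplug */
                if (blk_needs_flush_plug(current))
                        blk_schedule_flush_plug(current);

                /* stand-in for the real slow path: sets pi_blocked_on,
                 * walks the PI chain and sleeps */
                rt_mutex_slowlock_sketch(lock);
        }

IOW, a spinlock-turned-rtmutex never passes through that unplug point
in mainline, which is why the mutex/rwsem slow path is where it belongs.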

Whatever, time will tell if I'm properly alarmed, or merely paranoid :)

-Mike