Re: RFC for a new Scheduling policy/class in the Linux-kernel

From: Chris Friesen
Date: Mon Jul 13 2009 - 17:45:30 EST


Ted Baker wrote:

> I recognize that this complexity is a product of the desire to
> provide an implementation that does the right thing in all cases,
> but one needs to keep a sense of proportion. When one ends up having
> to solve a more complex mutual exclusion problem (on the wait-for
> graph and task priorities) in order to implement a mutual
> exclusion primitive, you have a case of abstraction inversion--
> something is out of whack.

Given that the semantics of POSIX PI locking assumes certain scheduler
behaviours, is it actually abstraction inversion to have that same
dependency expressed in the kernel code that implements it?
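
For reference, this is roughly what the userspace side of that
dependency looks like -- a minimal sketch using the standard pthread
attribute calls, not taken from any particular application or patch:

#include <pthread.h>

static pthread_mutex_t pi_lock;

static int init_pi_lock(void)
{
	pthread_mutexattr_t attr;
	int ret;

	pthread_mutexattr_init(&attr);
	/* Request priority inheritance: a low-priority holder gets
	 * boosted to the priority of the highest-priority waiter. */
	ret = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
	if (ret == 0)
		ret = pthread_mutex_init(&pi_lock, &attr);
	pthread_mutexattr_destroy(&attr);
	return ret;
}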

> For schedulability analysis, one just needs a way to bound the
> duration of priority inversion. Simple non-preemption (Linux
> spinlock_t) is sufficient for that, and it is easy to implement.
> You just have to be careful not to voluntarily suspend (give up
> the processor) while holding a lock.

The whole point of mutexes (and semaphores) within the Linux kernel is
that it is possible to block while holding them. I suspect you're going
to find it fairly difficult to convince people to switch to spinlocks
just to make it possible to provide latency guarantees.
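
To make the contrast concrete, here is a rough sketch (not from any
real driver): anything that can sleep has to stay outside a spinlock's
critical section, while a mutex tolerates blocking calls while held.

#include <linux/spinlock.h>
#include <linux/mutex.h>
#include <linux/slab.h>

static DEFINE_SPINLOCK(short_lock);
static DEFINE_MUTEX(long_lock);

static void update_counter(int *counter)
{
	spin_lock(&short_lock);
	(*counter)++;			/* short, non-sleeping work only */
	spin_unlock(&short_lock);
}

static void *alloc_under_mutex(size_t size)
{
	void *p;

	mutex_lock(&long_lock);
	p = kmalloc(size, GFP_KERNEL);	/* may sleep -- allowed under a mutex */
	mutex_unlock(&long_lock);
	return p;
}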

> The only selling point for PIP has been the ability of a thread to
> suspend itself while holding a lock, such as to wait for
> completion of an I/O operation.

You're comparing a full-featured PI implementation with a stripped-down
PP (priority protection, aka priority ceiling) approach. In an
apples-to-apples comparison, the selling point for PI vs PP is that
under PI the priority of the lock holder is automatically boosted only
if necessary, and only as high as necessary. On the other hand, PP
requires code analysis to properly set the ceilings for each individual
mutex.
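
For comparison, a PP mutex only works if the application supplies that
ceiling itself. A minimal sketch, where the ceiling value is assumed to
come out of exactly the code analysis mentioned above:

#include <pthread.h>

static int init_pp_lock(pthread_mutex_t *m, int ceiling)
{
	pthread_mutexattr_t attr;
	int ret;

	pthread_mutexattr_init(&attr);
	ret = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_PROTECT);
	if (ret == 0)
		/* 'ceiling' must be >= the priority of every task that
		 * will ever take this mutex -- that is the analysis
		 * PI does not need. */
		ret = pthread_mutexattr_setprioceiling(&attr, ceiling);
	if (ret == 0)
		ret = pthread_mutex_init(m, &attr);
	pthread_mutexattr_destroy(&attr);
	return ret;
}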

> I would argue that this practice
> is generally a sign of poor design, and it certainly throws out
> the notion of bounding the priority inversion due to blocking on a
> lock for schedulability analysis -- since now the lock-holding
> time can depend on I/O completion time, timers, etc.

Certainly if you block waiting for I/O while holding a lock then it
impacts the ability to provide latency guarantees for others waiting for
that lock. But this has nothing to do with PI vs PP or spinlocks, and
everything to do with how the lock is actually used.
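
As an illustration of "how the lock is actually used" (purely
hypothetical code -- read_device() and struct device_state are invented
placeholders), the fix is to keep the blocking I/O outside the critical
section, whatever the lock type:

#include <linux/mutex.h>

struct device_state {
	int cached_value;	/* hypothetical cached reading */
};

static DEFINE_MUTEX(data_lock);

/* Made-up placeholder for a blocking I/O call. */
extern int read_device(void);

static void update_from_device(struct device_state *st)
{
	int val;

	/* Do the slow, potentially blocking I/O first, unlocked... */
	val = read_device();

	/* ...then hold the lock only for the short, bounded update. */
	mutex_lock(&data_lock);
	st->cached_value = val;
	mutex_unlock(&data_lock);
}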

> Regarding the notion of charging proxy execution to the budget of
> the client task, I have grave concerns. It is already hard enough
> to estimate the amount of budget that a real-time task requires,
> without this additional complication.

Agreed.

Chris
