Re: [patch 2/2] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk

From: Ingo Molnar
Date: Thu May 15 2014 - 02:47:24 EST



A couple of suggestions:

1)

* Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:

> + if (requeue) {
> + if (waiter == rt_mutex_top_waiter(lock)) {

So we have a 'top_waiter' local variable already at this point, and we
use it here:

> + /* Boost the owner */
> + rt_mutex_dequeue_pi(task, top_waiter);
> + rt_mutex_enqueue_pi(task, waiter);
> + __rt_mutex_adjust_prio(task);
> +
> + } else if (top_waiter == waiter) {

To me it's a bit confusing that we have both the 'top_waiter' local
variable and also evaluate 'rt_mutex_top_waiter()' directly.

So what happens is that when we do the requeue, the top waiter might
change. I'd really suggest to signal that via naming - i.e. add
another local variable (which GCC will optimize out happily), named
descriptively:

orig_top_waiter = top_waiter;

and use that variable after that point.

> + /* Deboost the owner */
> + rt_mutex_dequeue_pi(task, waiter);
> + waiter = rt_mutex_top_waiter(lock);
> + rt_mutex_enqueue_pi(task, waiter);
> + __rt_mutex_adjust_prio(task);
> + }
> }

2)

Also another small flow of control side comment, because this code is
apparently rather tricky, I'd suggest a bit more explicit structure to
show the real flow of the logic: for example in the first reading of
the above block I mistakenly read it as a usual 'if () { } else { }'
block pattern - which it really isn't.

Something like this would be slightly easier to understand 'at a
glance', IMHO:

if (requeue) {
if (waiter == top_waiter) {
/* Boost the owner */
rt_mutex_dequeue_pi(task, orig_top_waiter);
rt_mutex_enqueue_pi(task, waiter);
__rt_mutex_adjust_prio(task);

} else {
if (orig_top_waiter == waiter) {
/* Deboost the owner */
rt_mutex_dequeue_pi(task, waiter);
waiter = rt_mutex_top_waiter(lock);
rt_mutex_enqueue_pi(task, waiter);
__rt_mutex_adjust_prio(task);
} else {
/* The requeueing did not affect us, no need to boost or deboost */
}
}
}

Assuming you agree with this structure, it's a bit more verbose, but
this might be one of the cases where verbosity helps readability.
(Note that I already propagated the 'orig_top_waiter' name into it.)

3)

Also note how the code continues:

raw_spin_unlock_irqrestore(&task->pi_lock, flags);

top_waiter = rt_mutex_top_waiter(lock);
raw_spin_unlock(&lock->wait_lock);

if (!detect_deadlock && waiter != top_waiter)
goto out_put_task;

goto again;

So we evaluate 'top_waiter' again - maybe we could move that line to
the two branches that actually have a chance to change the top waiter,
and not change it in the 'no need to requeue' case.

So ... all in one, what I would suggest is something like the patch
below, on top of your two patches. Totally untested and such.

Thanks,

Ingo

=======================>
Subject: locking/rtmutex: Clean up the rt_mutex_adjust_prio_chain() code flow
From: Ingo Molnar <mingo@xxxxxxxxxx>
Date: Thu May 15 08:39:42 CEST 2014

Clean up the code flow and variable names, always precisely
maintaining the 'top_waiter' and 'orig_top_waiter' values whenever
they can change.

This probably optimizes the !requeue case a bit as well.

Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
kernel/locking/rtmutex.c | 35 +++++++++++++++++++++--------------
1 file changed, 21 insertions(+), 14 deletions(-)

Index: tip/kernel/locking/rtmutex.c
===================================================================
--- tip.orig/kernel/locking/rtmutex.c
+++ tip/kernel/locking/rtmutex.c
@@ -284,7 +284,7 @@ static int rt_mutex_adjust_prio_chain(st
struct rt_mutex_waiter *orig_waiter,
struct task_struct *top_task)
{
- struct rt_mutex_waiter *waiter, *top_waiter = orig_waiter;
+ struct rt_mutex_waiter *waiter, *top_waiter = orig_waiter, *orig_top_waiter;
int detect_deadlock, ret = 0, depth = 0;
struct rt_mutex *lock;
unsigned long flags;
@@ -380,13 +380,17 @@ static int rt_mutex_adjust_prio_chain(st
goto out_unlock_pi;
}

- top_waiter = rt_mutex_top_waiter(lock);
-
if (requeue) {
+ orig_top_waiter = rt_mutex_top_waiter(lock);
+
/* Requeue the waiter */
rt_mutex_dequeue(lock, waiter);
waiter->prio = task->prio;
rt_mutex_enqueue(lock, waiter);
+
+ top_waiter = rt_mutex_top_waiter(lock);
+ } else {
+ orig_top_waiter = top_waiter = rt_mutex_top_waiter(lock);
}

/* Release the task */
@@ -401,8 +405,8 @@ static int rt_mutex_adjust_prio_chain(st
* If the requeue above changed the top waiter, then we need
* to wake the new top waiter up to try to get the lock.
*/
- if (top_waiter != rt_mutex_top_waiter(lock))
- wake_up_process(rt_mutex_top_waiter(lock)->task);
+ if (top_waiter != orig_top_waiter)
+ wake_up_process(top_waiter->task);
raw_spin_unlock(&lock->wait_lock);
goto out_put_task;
}
@@ -414,24 +418,27 @@ static int rt_mutex_adjust_prio_chain(st
raw_spin_lock_irqsave(&task->pi_lock, flags);

if (requeue) {
- if (waiter == rt_mutex_top_waiter(lock)) {
+ if (waiter == top_waiter) {
/* Boost the owner */
- rt_mutex_dequeue_pi(task, top_waiter);
+ rt_mutex_dequeue_pi(task, orig_top_waiter);
rt_mutex_enqueue_pi(task, waiter);
__rt_mutex_adjust_prio(task);

- } else if (top_waiter == waiter) {
- /* Deboost the owner */
- rt_mutex_dequeue_pi(task, waiter);
- waiter = rt_mutex_top_waiter(lock);
- rt_mutex_enqueue_pi(task, waiter);
- __rt_mutex_adjust_prio(task);
+ } else {
+ if (orig_top_waiter == waiter) {
+ /* Deboost the owner */
+ rt_mutex_dequeue_pi(task, waiter);
+ waiter = rt_mutex_top_waiter(lock);
+ rt_mutex_enqueue_pi(task, waiter);
+ __rt_mutex_adjust_prio(task);
+ } else {
+ /* The requeueing did not affect us, no need to boost or deboost */
+ }
}
}

raw_spin_unlock_irqrestore(&task->pi_lock, flags);

- top_waiter = rt_mutex_top_waiter(lock);
raw_spin_unlock(&lock->wait_lock);

if (!detect_deadlock && waiter != top_waiter)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/