Re: RT and Cascade interrupts

From: Ingo Molnar
Date: Mon May 30 2005 - 09:59:07 EST



* Trond Myklebust <trond.myklebust@xxxxxxxxxx> wrote:

> > Does this patch make any sense?
>
> Yes. I agree the scenario is theoretically possible (so I can queue
> that patch up for you). I am not convinced it is a plausible
> explanation for what John claims to be seeing, though.

i've added this patch (and your debug asserts, except for the
rpc_delete_timer() one) to the -RT tree and i've removed the earlier
hack - perhaps John can re-run the test and see whether it still occurs
under -rc5-RT-V0.7.47-13 or later?

most races are much more likely to occur under PREEMPT_RT than under
other preemption models, but maybe there's something else going on as
well.

Ingo

--- linux/net/sunrpc/sched.c.orig
+++ linux/net/sunrpc/sched.c
@@ -135,8 +135,6 @@ __rpc_add_timer(struct rpc_task *task, r
 static void
 rpc_delete_timer(struct rpc_task *task)
 {
-	if (RPC_IS_QUEUED(task))
-		return;
 	if (test_and_clear_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate)) {
 		del_singleshot_timer_sync(&task->tk_timer);
 		dprintk("RPC: %4d deleting timer\n", task->tk_pid);
@@ -337,6 +335,8 @@ static void __rpc_sleep_on(struct rpc_wa
 void rpc_sleep_on(struct rpc_wait_queue *q, struct rpc_task *task,
 				rpc_action action, rpc_action timer)
 {
+	BUG_ON(test_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate) != 0 ||
+			timer_pending(&task->tk_timer));
 	/*
 	 * Protect the queue operations.
 	 */
@@ -566,7 +566,6 @@ static int __rpc_execute(struct rpc_task
 
 	BUG_ON(RPC_IS_QUEUED(task));
 
- restarted:
 	while (1) {
 		/*
 		 * Garbage collection of pending timers...
@@ -594,6 +593,8 @@ static int __rpc_execute(struct rpc_task
 			unlock_kernel();
 		}
 
+		BUG_ON(test_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate) != 0 ||
+				timer_pending(&task->tk_timer));
 		/*
 		 * Perform the next FSM step.
 		 * tk_action may be NULL when the task has been killed
@@ -607,6 +608,7 @@ static int __rpc_execute(struct rpc_task
 			unlock_kernel();
 		}
 
+ restarted:
 		/*
 		 * Lockless check for whether task is sleeping or not.
 		 */
@@ -925,6 +927,8 @@ fail:
 
 void rpc_run_child(struct rpc_task *task, struct rpc_task *child, rpc_action func)
 {
+	BUG_ON(test_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate) != 0 ||
+			timer_pending(&task->tk_timer));
 	spin_lock_bh(&childq.lock);
 	/* N.B. Is it possible for the child to have already finished? */
 	__rpc_sleep_on(&childq, task, func, NULL);
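
for reference, a rough user-space sketch of the invariant the added BUG_ON()s check: a task must not go (back) to sleep while it still owns a timer. this is only a model, not kernel code. every model_* name and MODEL_* bit below is hypothetical, C11 atomics stand in for the kernel bitops, and a plain bool stands in for timer_pending(). it does not try to reproduce the race itself, only why the RPC_IS_QUEUED() early return can leave a timer behind.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* hypothetical stand-ins for the RPC_TASK_* bits in tk_runstate */
#define MODEL_HAS_TIMER (1u << 0)
#define MODEL_QUEUED    (1u << 1)

/* hypothetical model of the timer-related parts of struct rpc_task */
struct model_task {
	atomic_uint runstate;	/* models task->tk_runstate */
	bool timer_armed;	/* models timer_pending(&task->tk_timer) */
};

/* models test_and_clear_bit(): clear the bit atomically, report old value */
bool model_test_and_clear(atomic_uint *word, unsigned int bit)
{
	return (atomic_fetch_and(word, ~bit) & bit) != 0;
}

/* arm the model timer and mark the task as owning one */
void model_add_timer(struct model_task *t)
{
	t->timer_armed = true;
	atomic_fetch_or(&t->runstate, MODEL_HAS_TIMER);
}

/* pre-patch shape: a queued task keeps its timer */
void model_delete_timer_old(struct model_task *t)
{
	if (atomic_load(&t->runstate) & MODEL_QUEUED)
		return;
	if (model_test_and_clear(&t->runstate, MODEL_HAS_TIMER))
		t->timer_armed = false;
}

/* post-patch shape: the timer is torn down whether queued or not */
void model_delete_timer_new(struct model_task *t)
{
	if (model_test_and_clear(&t->runstate, MODEL_HAS_TIMER))
		t->timer_armed = false;
}

/* the condition the added BUG_ON()s require to be false */
bool model_timer_leaked(struct model_task *t)
{
	return (atomic_load(&t->runstate) & MODEL_HAS_TIMER) != 0 ||
		t->timer_armed;
}

/* models the BUG_ON() added to rpc_sleep_on(): abort on a leaked timer */
void model_sleep_on(struct model_task *t)
{
	assert(!model_timer_leaked(t));
	atomic_fetch_or(&t->runstate, MODEL_QUEUED);
}
```

in this model, calling model_delete_timer_old() on a task that is queued and has a timer leaves model_timer_leaked() true, so a following model_sleep_on() would abort, while model_delete_timer_new() clears both the bit and the timer.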