Re: RT and Cascade interrupts

From: john cooper
Date: Mon May 30 2005 - 16:38:50 EST


Trond Myklebust wrote:

I've appended a patch that
should check for strict compliance of the above rules. Could you try it
out and see if it triggers any Oopses?

Yes, the assert in rpc_delete_timer() fires just before
the cascade list corruption. This is consistent with
what I have seen, i.e. the timer in a released rpc_task
is still active.
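
To make the failing condition concrete, here is a rough userspace sketch
of the two checks the appended patch adds. It is not kernel code:
struct model_task and both helper functions are hypothetical stand-ins
that I am assuming for illustration. has_timer models the
RPC_TASK_HAS_TIMER bit and timer_pending models
timer_pending(&task->tk_timer).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; NOT the kernel's struct rpc_task. */
struct model_task {
	bool has_timer;      /* stands in for the RPC_TASK_HAS_TIMER bit */
	bool timer_pending;  /* stands in for timer_pending(&task->tk_timer) */
};

/*
 * Same condition as the BUG_ON added to rpc_delete_timer():
 * violated when the bit is clear but a timer is still pending.
 */
static bool delete_timer_check_violated(const struct model_task *t)
{
	return !t->has_timer && t->timer_pending;
}

/*
 * Same condition as the BUG_ON added to rpc_sleep_on(), __rpc_execute()
 * and rpc_run_child(): violated when any timer state is left over.
 */
static bool no_timer_check_violated(const struct model_task *t)
{
	return t->has_timer || t->timer_pending;
}
```

A task with has_timer clear and timer_pending set trips the first check,
which is the state the assert in rpc_delete_timer() reports.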

BTW, the patch from Oleg is relative to 2.6.12 and didn't
appear to apply to the 2.6.11-derived base with which I'm
working (the RPC_IS_QUEUED() test at the head of rpc_delete_timer()
does not exist there). In any case, relocating the restarted: label in
__rpc_execute() did not influence the failure. I'd like to
move to a 2.6.12-based -RT patch; however, I'm dealing with
"code in the pipe" and unfortunately don't have that option.

Sorry I'm just responding now. We're in the middle of a
long holiday weekend. I will have more time tomorrow
to analyze this further.

-john


------------------------------------------------------------------------

sched.c | 8 ++++++++
1 files changed, 8 insertions(+)

Index: linux-2.6.12-rc4/net/sunrpc/sched.c
===================================================================
--- linux-2.6.12-rc4.orig/net/sunrpc/sched.c
+++ linux-2.6.12-rc4/net/sunrpc/sched.c
@@ -135,6 +135,8 @@ __rpc_add_timer(struct rpc_task *task, r
 static void
 rpc_delete_timer(struct rpc_task *task)
 {
+	BUG_ON(test_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate) == 0 &&
+			timer_pending(&task->tk_timer));
 	if (RPC_IS_QUEUED(task))
 		return;
 	if (test_and_clear_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate)) {
@@ -337,6 +339,8 @@ static void __rpc_sleep_on(struct rpc_wa
 void rpc_sleep_on(struct rpc_wait_queue *q, struct rpc_task *task,
 				rpc_action action, rpc_action timer)
 {
+	BUG_ON(test_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate) != 0 ||
+			timer_pending(&task->tk_timer));
 	/*
 	 * Protect the queue operations.
 	 */
@@ -594,6 +598,8 @@ static int __rpc_execute(struct rpc_task
 			unlock_kernel();
 		}
+		BUG_ON(test_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate) != 0 ||
+				timer_pending(&task->tk_timer));
 		/*
 		 * Perform the next FSM step.
 		 * tk_action may be NULL when the task has been killed
@@ -925,6 +931,8 @@ fail:
 void rpc_run_child(struct rpc_task *task, struct rpc_task *child, rpc_action func)
 {
+	BUG_ON(test_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate) != 0 ||
+			timer_pending(&task->tk_timer));
 	spin_lock_bh(&childq.lock);
 	/* N.B. Is it possible for the child to have already finished? */
 	__rpc_sleep_on(&childq, task, func, NULL);


--
john.cooper@xxxxxxxxxxx