Re: RT and Cascade interrupts

From: john cooper
Date: Fri May 27 2005 - 09:00:36 EST


Ingo Molnar wrote:
* john cooper <john.cooper@xxxxxxxxxxx> wrote:
The RPC code is attempting to replicate state of
timer ownership for a given rpc_task via RPC_TASK_HAS_TIMER
in rpc_task.tk_runstate. Besides not working
correctly in the case of preemptable context it is
a replication of state of a timer pending in the
cascade structure (ie: timer->base). The fix
changes the RPC code to use timer->base when
deciding whether an outstanding timer registration
exists during rpc_task tear down.

Note: this failure occurred in the 40-04 version of
the patch though it applies to more current versions.
It was seen when executing stress tests on a number
of PPC targets running on an NFS mounted root though
was not observed on a x86 target under similar
conditions.


should this fix go upstream too?

Yes. The RPC code is attempting to replicate existing
and easily accessible state information in a timer
structure. The simplistic means by which it does so
fails if ksoftirqd/rpc_run_timer() runs in preemptive
context.

-john


--
john.cooper@xxxxxxxxxxx
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/