Re: SMP _death_

Ingo Molnar (mingo@pc5829.hil.siemens.at)
Mon, 28 Apr 1997 20:22:53 +0200 (MET DST)


On Mon, 28 Apr 1997, David S. Miller wrote:

> Date: Mon, 28 Apr 1997 11:50:52 -0400 (EDT)
> From: "Richard B. Johnson" <root@analogic.com>
>
> > Although this can't possibly be the right fix, it can hint us as
> > to where it really is. All this patch does is single thread all
> > interrupt handling, which means there is a re-entrancy problem in
> > some driver still which has yet to be resolved.
>
> Well I can even tell you the driver. However, the problem will
> persist for all other drivers unless they are rewritten -- and I'm
> sure that nobody wants to do that. I am quite aware what the patch
> does.
>
> Incorrect, I'd say 95% or more of the drivers behave properly under
> the new scheme. There is one, only one, case where a driver would
> need to be modified in some way to work properly in the new scheme
> where all CPUs can service interrupts in parallel.
>
> This case, as noted here many times, is when a driver disables
> interrupts on the adapter itself and expects nobody else to be in an
> interrupt service routine for that card. Such a driver must do a
> synchronize_irq() after it disables interrupts on the _adapter_. (I
> stress this, because things work just fine if interrupts are disabled
> using disable_irq() or by using a cli().)

Plus, if really necessary, we can make the detection of such problems
automatic by generating an 'irq handler reentrancy map', which should be
the same for uniprocessor and SMP systems, and which shows exactly the EIP
where synchronize_irq() should be done.
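
To make the problem case concrete, here is a minimal userspace sketch of
it. This is not kernel code: struct sim_adapter and every sim_* function
are hypothetical names invented for this illustration, and the "other
CPU" is modelled as a step function instead of a real processor. It only
shows why masking interrupts on the adapter is not enough while another
CPU may still be inside the handler, and how waiting for that handler to
finish (the role synchronize_irq() plays in the quoted text) closes the
window.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one adapter; none of these names are kernel APIs. */
struct sim_adapter {
    bool irq_enabled;        /* interrupt-enable bit on the card itself */
    int  handler_steps_left; /* > 0 while another CPU is still in the ISR */
    int  touched_by_isr;     /* counts ISR accesses to shared driver state */
};

/* One time slice of the "other CPU": it runs one more step of the ISR. */
static void sim_other_cpu_step(struct sim_adapter *a)
{
    if (a->handler_steps_left > 0) {
        a->touched_by_isr++;
        a->handler_steps_left--;
    }
}

/* Stand-in for synchronize_irq(): wait until no CPU is in the handler. */
static void sim_synchronize_irq(struct sim_adapter *a)
{
    while (a->handler_steps_left > 0)
        sim_other_cpu_step(a);
}

/*
 * Mask interrupts on the adapter, optionally wait for a running handler,
 * then let the other CPU run one more step while the driver assumes it
 * has the card to itself. Returns the number of ISR accesses that landed
 * inside that "exclusive" window; anything above 0 is a race.
 */
static int sim_quiesce(struct sim_adapter *a, bool do_sync)
{
    a->irq_enabled = false;          /* no new interrupts from the card */
    if (do_sync)
        sim_synchronize_irq(a);      /* drain the handler already running */

    int before = a->touched_by_isr;
    sim_other_cpu_step(a);           /* other CPU keeps going, if it can */
    return a->touched_by_isr - before;
}
```

With a handler already in flight on the other CPU, calling
sim_quiesce(&a, true) reports no accesses in the exclusive window, while
sim_quiesce(&a, false) reports one. On a uniprocessor the handler could
never be in flight at that point, which is why only this one driver
pattern needs attention.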

[to the original poster:] Tell me if you are ready to compile two kernels
with an experimental patch to get the "reentrancy map" out of the system.

-- mingo