Re: RT-Linux and SMP

Linus Torvalds (torvalds@transmeta.com)
Thu, 24 Apr 1997 10:03:59 -0700 (PDT)


On Thu, 24 Apr 1997, Victor Yodaiken wrote:
>
> We need some info to get RT-Linux SMP working.
>
> 1. The scheme in .35 for SMP allows multiple ISRs to run in
> different processors and synchronizes only on a cli. A cli
> only completes when the cli-ing processor has locked further
> ISR invocations and has waited until no other processors are in
> irq code.
>
> Question : Why is it not enough to just make sure that no
> other processor is in cli-mode? We already allow
> more than one ISR to be active without CLI.

We _have_ to wait for all outstanding interrupt handlers to complete,
because otherwise "cli()" would be totally useless. For example:

	cli();
	.. do critical thing ..
	sti();

if the cli() only makes sure that this CPU is the only one currently in
"cli mode", there could be an interrupt executing on another CPU that
started _before_ we did the cli(), and happens to be executing during our
critical region. If that interrupt then happens to use or modify the same
data that we are using in the critical region, we're dead.

In short, unless "cli()" waits for all outstanding (*) interrupts to
finish, cli() is meaningless.

(*) On other CPUs - we can't wait for our _own_ CPU's interrupts to finish,
because if we're in an interrupt handler when we do the cli(), we'll be
waiting for ourselves :)
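
As a rough illustration of the difference between the two rules, here is a
small single-threaded model. It is not the kernel's code: the struct, the
field names and NCPUS are all made up for the sketch. The "weak" rule is the
one from the question (only check that nobody else is in cli mode), and the
"strict" rule also waits for handlers on the other CPUs, skipping our own.

```c
#include <stdbool.h>

#define NCPUS 4   /* hypothetical CPU count for the model */

/* Hypothetical snapshot of the SMP irq state, for illustration only. */
struct smp_state {
    bool in_irq[NCPUS];  /* CPU i is currently running an interrupt handler */
    int  cli_owner;      /* CPU holding the global irq lock, or -1 if none */
};

/* Rule from the question: only make sure no other CPU is in "cli mode". */
bool weak_cli_may_complete(const struct smp_state *s, int cpu)
{
    return s->cli_owner == -1 || s->cli_owner == cpu;
}

/* Rule from the reply: additionally wait until no *other* CPU is inside
 * an interrupt handler. Our own CPU is skipped, otherwise a cli() issued
 * from inside a handler would wait for itself forever. */
bool strict_cli_may_complete(const struct smp_state *s, int cpu)
{
    if (!weak_cli_may_complete(s, cpu))
        return false;
    for (int i = 0; i < NCPUS; i++) {
        if (i != cpu && s->in_irq[i])
            return false;
    }
    return true;
}
```

With CPU 1 in the middle of a handler and nobody in cli mode, the weak rule
lets CPU 0's cli() complete at once, so the handler on CPU 1 can still touch
the data in the critical region. The strict rule makes CPU 0 wait, but lets
CPU 1 itself do a cli() from inside its own handler.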

If you want less expensive mutual exclusion, you should never use cli():
you should instead use a spinlock. The spinlock will lock only that
particular critical region against others who try to enter that critical
region, and that can in general be done much faster than disabling
interrupts globally. But the global irq disable _has_ to do what we do
now.
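
A minimal sketch of the per-region idea, using C11 atomics in user space. It
is not the kernel's spinlock implementation; the demo_* names and the two
example locks are invented for the illustration. The point is only that
holding one lock says nothing about the other.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical test-and-set spinlock, for illustration only. */
typedef struct { atomic_flag held; } demo_spinlock;

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once; returns true if we got the lock. */
bool demo_spin_trylock(demo_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the lock is ours. */
void demo_spin_lock(demo_spinlock *l)
{
    while (!demo_spin_trylock(l))
        ;  /* spin */
}

void demo_spin_unlock(demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Two unrelated critical regions, each with its own lock (made-up names). */
demo_spinlock queue_lock = DEMO_SPINLOCK_INIT;
demo_spinlock stats_lock = DEMO_SPINLOCK_INIT;
```

While queue_lock is held, a second attempt on queue_lock fails, but
stats_lock can still be taken: only code that wants the same region is kept
out, and nothing is stopped globally.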

> 2. There was a previous discussion about whether interrupts needed
> to be disabled in the controller. Someone said that level triggered
> devices would cause lock-up unless this were done.
>
> Question: Is this correct?

This is correct. Essentially, level-triggered devices can be handled in
two ways:

- disable interrupts in the interrupt controller before enabling other
interrupts again (which is what Linux currently does)
- disable _all_ interrupts until we have made sure that the device is no
longer asserting the irq line. This can be done either by having the CPU
not accept interrupts, or by not acknowledging the irq to the interrupt
controller.

Of the two choices, the current Linux behaviour is obviously the better
one. In many cases it can take quite a while before we can be sure that
the device is no longer asserting the irq line (in some cases this is just
a matter of reading the status register of the device, but in other cases
it may be a question of reading all pending data off the device etc).
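
The first approach can be sketched with a toy model of one level-triggered
line. This is not driver or kernel code: the struct and function names are
hypothetical, and "re-enabling CPU interrupts" is modelled as a single check
of whether the line would fire again.

```c
#include <stdbool.h>

/* Hypothetical model of one level-triggered irq line. */
struct irq_line {
    bool asserted;  /* device keeps the line active until it is serviced */
    bool masked;    /* line is masked in the (simulated) controller */
};

/* Would the CPU take this interrupt if it enabled interrupts right now? */
bool irq_would_fire(const struct irq_line *l)
{
    return l->asserted && !l->masked;
}

/* Handle one interrupt. Returns true if the handler would have been
 * re-entered for the same, still-asserted line before the device was
 * serviced. On real hardware that re-entry repeats without end. */
bool handle_level_irq(struct irq_line *l, bool mask_first)
{
    if (mask_first)
        l->masked = true;            /* mask in the controller first */

    /* CPU interrupts are re-enabled here so that other irqs can run. */
    bool reentered = irq_would_fire(l);

    l->asserted = false;             /* service the device (may be slow) */
    l->masked = false;               /* safe to unmask again */
    return reentered;
}
```

With mask_first set, the slow part of the handler runs with other interrupts
enabled and the line stays quiet. Without it, the still-asserted line fires
again as soon as interrupts are enabled, which is the lock-up the question
refers to.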

> Question: What devices need to be level triggered and why?

Just about all devices should be level triggered - you can't share
interrupts otherwise. For historical reasons the basic ISA devices are
_not_ level triggered, but they should really be considered a broken
design.
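
One common way to see the sharing problem, as a toy model: two devices drive
the same line, and the line is active while any of them asserts it. The names
below are hypothetical and this models no particular controller. An
edge-triggered input only notices an inactive-to-active transition of the
shared line, while a level-triggered input looks at the line's current state.

```c
#include <stdbool.h>

#define NDEV 2   /* hypothetical number of devices sharing the line */

/* Hypothetical shared irq line: active while any device asserts it. */
struct shared_line {
    bool dev_asserting[NDEV];
};

/* What a level-triggered input sees: the current state of the line. */
bool line_level(const struct shared_line *l)
{
    for (int i = 0; i < NDEV; i++) {
        if (l->dev_asserting[i])
            return true;
    }
    return false;
}

/* A device raises its request. Returns true if an edge-triggered input
 * would see a new interrupt, i.e. the shared line went from inactive
 * to active as a result. */
bool device_raise(struct shared_line *l, int dev)
{
    bool before = line_level(l);
    l->dev_asserting[dev] = true;
    return !before && line_level(l);
}

/* The device's request has been serviced and it releases the line. */
void device_clear(struct shared_line *l, int dev)
{
    l->dev_asserting[dev] = false;
}
```

If device 1 raises its request while device 0 is still asserting the line,
there is no new edge, so an edge-triggered input never hears about it. The
level is still active after device 0 has been serviced, so a level-triggered
input keeps interrupting until device 1 has been handled as well.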

(Edge-triggering simplifies some things, and wouldn't result in the
deadlock of just spinning on interrupt routine entry, but that does not
really excuse some of the other braindamages of edge triggering).

Linus