Re: [patch] SMP race fix [was Re: SMP lockup & 3c509 on 2.2.x [aka.

Andrea Arcangeli (andrea@e-mind.com)
Thu, 6 May 1999 03:32:50 +0200 (CEST)


On Wed, 5 May 1999, Linus Torvalds wrote:

>I agree that the above is not necessarily _clever_ nor necessarily good
>programming practice. But I think it's even worse programming practice if
>we made it lock up with no messages for no really obvious reason, because
>it _is_ at least straightforward and obvious code and as such we can't
>really claim that it is outright evil.

Ah yes. I didn't thought about the case of sharing the same code for both
the irq handler and the bh handler. Probably something like this would be
better (then the caller in the unlikely case should use `if
(!in_interrupt()) disable_irq() else disable_irq_nosync()' to make sure
that he knows what it's doing and that disable_irq_nosync() is enough for
him.

(just for the record, this patch it's incremental with my last big patch,
but it's not needed to know if my fix works or not)

Index: arch/i386/kernel/irq.c
===================================================================
RCS file: /var/cvs/linux/arch/i386/kernel/irq.c,v
retrieving revision 1.1.2.28
diff -u -r1.1.2.28 irq.c
--- irq.c 1999/05/05 19:52:54 1.1.2.28
+++ linux/arch/i386/kernel/irq.c 1999/05/06 01:07:37
@@ -759,7 +759,7 @@
* hardware disable after having gotten the irq
* controller lock.
*/
-static inline __disable_irq(unsigned int irq)
+static inline void __disable_irq(unsigned int irq)
{
unsigned long flags;

@@ -774,8 +774,16 @@
void disable_irq(unsigned int irq)
{
__disable_irq(irq);
+
+ if (in_interrupt())
+ goto from_irq;
+
while (irq_desc[irq].status & IRQ_INPROGRESS)
barrier();
+ return;
+
+ from_irq:
+ printk(KERN_ERR "disable_irq: can't synchronize irq %d!\n", irq);
}

void disable_irq_nosync(unsigned int irq)

>So I fully agree with your point that there is a very subtle race in the
>disable_irq() path. It's just that I want to think more about the
>solution, and I'd ask you to think about it some more too.

Agreed, and I'll be glad to continue thinking about it some more ;)).

Basically the problems was two (I think I have mentioned them many times
but never with a scheme):

- disable_irq() was not waiting really for irq completation because
synchronize_irq() hadn't enough informations yet

- the no-wait probelm caused by the race above was allowing us to do
an enable_irq (done a bit after the disable_irq()) with the irq
still running and that caused us to clear the inprogress bit
of the running irq so allowing a second irq to reenter the first
irq.

If there wouldn't be the second problem (I consider it a problem) the
first race woudln't have harmed the network code that is just able to
resolve smp-threading between irq and normal context (that's why I
consider disable_irq_nosync() perfectly safe for the network code).

If there wouldn't be the first problem the second problem wouldn't show up
at all. That's why you claim that the irqinprogress current design is
right. But I don't think so.

>That kind of split would make sense, and would allow the autoirq
>behaviour, as well as noticing spurious interrupts with no handler (which

The autoprobe jus works fine.

>was a invaluable debugging tool for me where the current INPROGRESS
>semantics made a lot of sense).

With my patch the inprogress bit is still set if there is no handler
registered. But to catch it, you have to first clear the IRQ_DISABLED bit:
exactly what the probe_irq_on does: probe_irq_on first enable irqs and
then shutdown all the spurious interrupt that happened before we had the
time to issue an irq from our device. This way we won't catch wrong irqs
by spurious interrupts.

>Alternatively, maybe there are other approaches - and whatever we do I do
>believe that we need to handle the nesting case gracefully (your "nosync"
>doesn't really help - because it isn't useful as a above kind of
>exclusivity guarantee exactly because it doesn't wait, see?)

I don't see why we must be forced to wait. With my approch once we set the
IRQ_DISABLED bit from disable_irq_nosync() we are sure that noirq will run
anymore after we return from disable_irq_nosync() (so no risk to deadlock
due an interrupt after the call). And if there is a irq running it will be
far away from us. If the code after disable_irq() is able to run SMP
threaded (and infact it's using disable_irq_nosync() _only_ and exactly
because it want to avoid deadlocking in a spinlock) I think there's no
need to wait. At least if the further enable_irq() won't clear at random
time the IRQ_INPROGRESS bit (as happens now) while maybe there was still
the irq running on the other CPU...

Now it's really time to sleep for me... I'll continue thinking tomorrow
(and usually after a long sleep I think far better ;)).

Goodnight.

Andrea Arcangeli

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/