Re: [PATCH] prevent sparc64 from invoking irq handlers on offline CPUs

From: David Miller
Date: Wed Sep 03 2008 - 05:21:56 EST


From: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
Date: Tue, 2 Sep 2008 17:42:11 -0700

> On Tue, Sep 02, 2008 at 05:16:30PM -0700, David Miller wrote:
> > So I'd like to hold off on this patch until this locking issue is
> > resolved.
>
> OK, it is your architecture. But in the meantime, sparc64 can take
> interrupts on CPUs whose cpu_online_map bits have been cleared.

Paul, here is how I resolved this in my tree.

First, I applied a patch that killed that 'call_lock' and replaced
the accesses with ipi_call_lock() and ipi_call_unlock().

Then I sed'd up your patch so that it applies properly after that
change.

I still think there will be a problem here on sparc64. I had the
online map clearing there happening first because the fixup_irqs()
thing doesn't drain interrupts. It just makes sure that "device"
interrupts no longer point at the cpu. So all new device interrupts
after fixup_irqs() will not go to the cpu.

Then we do the:

	local_irq_enable();
	mdelay(1);
	local_irq_disable();

thing to process any interrupts which were sent while we were
retargeting the device IRQs.

I also intended this to drain the cross-call interrupts, which is
why I cleared the cpu_online_map bit before fixup_irqs() and
the above "enable/disable" sequence run.

With your change in there now, IPIs won't get drained and the system
might get stuck as a result.

I wonder if it would work if we cleared the cpu_online_map right
before the "enable/disable" sequence, but after fixup_irqs()?
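
The ordering question above can be illustrated with a small user-space
model. This is only a sketch, not kernel code: online[], pending_ipi[],
send_ipi(), drain_window() and offline_cpu() are hypothetical stand-ins,
and the model assumes that a sender only targets CPUs whose online bit is
still set. The "clear after the window" branch is a hypothetical ordering
used for contrast, not a claim about what any particular patch does.

```c
/* User-space model only; every name here is hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

static bool online[NR_CPUS];      /* stands in for cpu_online_map */
static int  pending_ipi[NR_CPUS]; /* cross-calls sent but not yet handled */
static int  handled_ipi[NR_CPUS]; /* cross-calls processed in the window */

/* Assumption: a sender only targets CPUs that are still marked online. */
static bool send_ipi(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (!online[cpu])
		return false;
	pending_ipi[cpu]++;
	return true;
}

/* Models local_irq_enable(); mdelay(1); local_irq_disable(); */
static void drain_window(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	handled_ipi[cpu] += pending_ipi[cpu];
	pending_ipi[cpu] = 0;
}

/*
 * Take one CPU offline and return how many cross-calls are left
 * unhandled afterwards.
 */
static int offline_cpu(int cpu, bool clear_before_window)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	online[cpu] = true;
	pending_ipi[cpu] = 0;
	handled_ipi[cpu] = 0;

	send_ipi(cpu);	/* in flight before the offline path starts */
	/* fixup_irqs() would run here; it only retargets device IRQs */
	send_ipi(cpu);	/* cross-call racing with the retargeting */

	if (clear_before_window) {
		online[cpu] = false;	/* after fixup_irqs(), before window */
		drain_window(cpu);
		send_ipi(cpu);		/* late sender: refused */
	} else {
		drain_window(cpu);
		send_ipi(cpu);		/* late sender: accepted, never drained */
		online[cpu] = false;
	}
	return pending_ipi[cpu];
}
```

In this model, offline_cpu(cpu, true) leaves nothing pending, while
offline_cpu(cpu, false) can leave a cross-call queued on a CPU that will
never handle it. That is the "stuck" case, under the stated assumptions.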

Paul, what do you think?