Re: [PATCH] [BUGFIX] crash/ioapic: Prevent crash_kexec() from deadlocking of ioapic_lock

From: Don Zickus
Date: Tue Aug 20 2013 - 10:28:18 EST


On Tue, Aug 20, 2013 at 03:12:32AM -0700, Eric W. Biederman wrote:
> Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@xxxxxxxxxxx> writes:
>
> > Hi Ingo,
> >
> > Thank you for fixing typos!
> > OK, I'll fix them and rename to ioapic_zap_locks().
> >
> > Thank you again!
>
>
> The better fix for this would be to remove the disable_IO_APIC call from
> crash_kexec.
>
> I know last time it was investigated the kernel was very close to
> working without needing that, and the code will be much more robust in
> the long term if we can avoid disabling them in the crashing kernel.
>
> Yoshihiro is there any chance you can look into removing the
> disable_IO_APIC entirely?
>
> The apic disablement and the disable_IO_APIC exists entirely due to
> limitations in the kernel boot path.

Yup. We went down this path a year ago:

https://lkml.org/lkml/2012/2/2/331

Then we got sidetracked and talked about removing the lapic stuff at
shutdown too:

http://lists.infradead.org/pipermail/kexec/2012-February/006017.html
(sorry couldn't find lkml link for some reason)

And the second patch was committed.

However, it was quickly reverted when Yinghai Lu noticed a problem:

https://lkml.org/lkml/2012/2/11/143

The problem stemmed from the nmi_watchdog firing an NMI in the middle of
the transition between the two kernels (we hadn't shut down the lapic),
which caused a reset (there is no NMI handler in purgatory).

I think I dropped the ball on investigating how to write an IDT for the
purgatory code to handle spurious NMIs.

Regardless of all that, I think if we stick to just removing the ioapic
shutdown code (i.e. the first patch linked above), we should be ok. I
believe my testing went smoothly. It was the lapic stuff that needed more
tweaking.

So, I agree with Eric, let's remove the disable_IO_APIC() stuff and keep
the code simpler.
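
For anyone skimming the thread, here is a toy userspace model of why dropping
the call sidesteps the deadlock in the subject line. This is only a sketch, not
kernel code: every name in it (ioapic_lock_model, model_disable_ioapic(),
model_crash_shutdown(), crash_while_lock_held()) is made up for illustration,
and it assumes the crash-time IO-APIC teardown is what takes ioapic_lock. A
trylock stands in for the spin so the "would hang" case returns false instead
of hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for ioapic_lock; NOT the kernel's raw spinlock. */
static atomic_flag ioapic_lock_model = ATOMIC_FLAG_INIT;

static bool toy_trylock(void)
{
	/* test_and_set returns the previous value: false means we got it. */
	return !atomic_flag_test_and_set(&ioapic_lock_model);
}

static void toy_unlock(void)
{
	atomic_flag_clear(&ioapic_lock_model);
}

/*
 * Hypothetical model of the IO-APIC teardown step. Returns false where
 * a real spinning lock acquisition would never return.
 */
static bool model_disable_ioapic(void)
{
	if (!toy_trylock())
		return false;	/* lock already held: real code would spin */
	/* ... the real code would mask the IO-APIC entries here ... */
	toy_unlock();
	return true;
}

/*
 * Hypothetical model of the crash shutdown path. Returns true if we
 * would get as far as jumping to the second kernel.
 */
static bool model_crash_shutdown(bool call_disable_ioapic)
{
	if (call_disable_ioapic && !model_disable_ioapic())
		return false;
	return true;
}

/* The crash hits while the interrupted context holds the lock. */
static bool crash_while_lock_held(bool call_disable_ioapic)
{
	bool got = toy_trylock();
	bool reached;

	assert(got);
	(void)got;	/* silence unused warning when built with NDEBUG */
	reached = model_crash_shutdown(call_disable_ioapic);
	toy_unlock();
	return reached;
}
```

In this model crash_while_lock_held(true) never reaches the second kernel,
while crash_while_lock_held(false) does, because the crash path no longer
touches the lock at all. Re-initializing the lock (the ioapic_zap_locks()
idea) would also unblock it, but it keeps the extra code in the crash path.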

Cheers,
Don