Re: [V3][PATCH 3/6] x86, nmi: wire up NMI handlers to new routines

From: Borislav Petkov
Date: Wed Sep 07 2011 - 12:49:53 EST


On Tue, Sep 06, 2011 at 12:52:53PM -0400, Don Zickus wrote:

[..]

> > > diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> > > index 08363b0..3fc65b6 100644
> > > --- a/arch/x86/kernel/cpu/mcheck/mce.c
> > > +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> > > @@ -908,9 +908,6 @@ void do_machine_check(struct pt_regs *regs, long error_code)
> > >
> > > percpu_inc(mce_exception_count);
> > >
> > > - if (notify_die(DIE_NMI, "machine check", regs, error_code,
> > > - 18, SIGKILL) == NOTIFY_STOP)
> > > - goto out;
> >
> > Yes, this code is strange. I checked all the nmi handlers but couldn't
> > find one that is directly related to this call. But it could be to
> > handle IPIs even in the case of an mce to let backtrace and reboot
> > work. CC'ing mce guys.
> >
> > I would rather add an nmi_handle() call here.
>
> I checked too, and the code predates 2.6.12, so I have no idea why it was
> there. One of the reasons I wanted to remove it was to keep all the users
> internal to the nmi.c file. Also I remove most of the parameters from
> notify_die as they were not being used. I would hate to add them back in
> because of an mce hack.
>
> I'm sure after 4-5 years (whenever this was added), we can find a better
> way to do whatever it is doing, no?
>
> But if I have to support this call, it complicates all the changes I made
> unnecessarily. :-(

This code comes from a combined x86_64 update commit from 2003, AFAICT:

commit 3d71dbc9afbd7eecdc71e0329d6f16f2dcd48e39
Author: Andi Kleen <ak@xxxxxxx>
Date: Mon Mar 24 19:54:54 2003 -0800

    [PATCH] x86-64 updates

    Lots of x86-64 updates. Merge with 2.4 and NUMA works now. Also reenabled
    the preemptive kernel. And some other bug fixes.
    IOMMU disabled by default now because it has problems.

    - Add more CONFIG options for device driver debugging and iommu
      force/debug. (don't enable iommu force currently)
    - Some S3/ACPI fixes/cleanups from Pavel.
    ....


and the file was called arch/x86_64/kernel/bluesmoke.c back then.
Unfortunately, nothing in the commit message hints at why it was added.
I guess it was some sort of notification mechanism to warn the rest of
the kernel that we might die soon because we received an MCE, so that
it can take some cleanup action before going down.

So I don't think we use it anywhere. Originally, Robert and I thought
that mce-inject.c relied on it indirectly, but mce-inject.c does its own
NMI injection when the MCE needs to be broadcast and injected on all
cores. It also registers its own NMI notifier, mce_raise_notify(), which
you've already converted.
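
For anyone following along without the tree at hand, here is a minimal
userspace sketch of the notifier-chain pattern behind the removed hunk.
It is not the kernel's implementation: struct notifier, chain_register(),
chain_notify(), EVENT_MCE, claiming_handler() and machine_check_sketch()
are made-up names for illustration, and only the NOTIFY_* values are meant
to follow include/linux/notifier.h. It shows why the removed check only
matters if some registered handler actually claims the event.

```c
/* Userspace illustration of a die-notifier style chain; not kernel code. */
#include <assert.h>
#include <stddef.h>

/* Values intended to mirror include/linux/notifier.h */
#define NOTIFY_DONE      0x0000  /* handler did not care about the event */
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000  /* stop walking the chain */
#define NOTIFY_STOP      (NOTIFY_OK | NOTIFY_STOP_MASK)

#define EVENT_MCE 1UL            /* hypothetical event number */

/* Hypothetical stand-in for the kernel's struct notifier_block */
struct notifier {
	int (*call)(struct notifier *self, unsigned long event, void *data);
	struct notifier *next;
};

static struct notifier *chain_head;

/* Push a handler onto the front of the chain (no priorities, no locking). */
static void chain_register(struct notifier *nb)
{
	assert(nb != NULL && nb->call != NULL);
	nb->next = chain_head;
	chain_head = nb;
}

/* Call each handler in turn; stop early once one sets the stop bit. */
static int chain_notify(unsigned long event, void *data)
{
	struct notifier *nb;
	int ret = NOTIFY_DONE;

	for (nb = chain_head; nb != NULL; nb = nb->next) {
		ret = nb->call(nb, event, data);
		if (ret & NOTIFY_STOP_MASK)
			break;
	}
	return ret;
}

/* Hypothetical handler that claims EVENT_MCE and ignores everything else. */
static int claiming_handler(struct notifier *self, unsigned long event,
			    void *data)
{
	(void)self;
	(void)data;
	return event == EVENT_MCE ? NOTIFY_STOP : NOTIFY_DONE;
}

static struct notifier claiming = { claiming_handler, NULL };

/*
 * Same shape as the removed check in do_machine_check(): returns 1 when a
 * handler claimed the event (the "goto out" path), 0 when the normal
 * machine check handling would continue.
 */
static int machine_check_sketch(void)
{
	if (chain_notify(EVENT_MCE, NULL) == NOTIFY_STOP)
		return 1;
	return 0;
}
```

With an empty chain, machine_check_sketch() always falls through, which is
the situation described above: if nothing registered on the chain ever
returns NOTIFY_STOP for a machine check, the check is dead code. Only after
chain_register(&claiming) does the early-exit path trigger.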

Tony, anything I'm missing?

--
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551