Re: [RFC] x86, NMI, Treat unknown NMI as hardware error

From: Huang Ying
Date: Sun May 15 2011 - 21:09:52 EST


On 05/15/2011 02:34 PM, Cyrill Gorcunov wrote:
> On 05/15/2011 04:06 AM, huang ying wrote:
> ...
>>>
>>> yes, it is not good. But at least we *must* provide a way to turn this new feature off
>>> via the command line, I think. One reason for me is perf unknown NMIs (at the moment we seem
>>> to have caught and cured all parasitic NMI sources, but there is no guarantee we won't
>>> meet them in the future due to some code change or whatever). And bloating traps.c with
>>> new if()'s is not that good I guess, which is why I asked if there is a way to do all the
>>> work via notifiers ;)
>>
>> Yes. We should consider the perf unknown NMI issues. But compared
>> with pushing all the magic to the user, I think the better way is to have a
>> better default behavior in the kernel. For example, we can turn off the
>> unknown-NMI-as-hwerr logic temporarily if there is more than 1 perf
>> NMI event in action. Is that reasonable?
>
> I'm personally fine even if it's enabled by default; I'm only concerned that we have
> an option to disable hwerr from the boot line.

Is the white list mechanism not sufficient? Can spurious unknown NMIs
occur on white-listed machines? Don't people want to protect their data?

>> And, I am not a big fan of notifiers; they make code hard to
>> understand. If you have concerns about the size of traps.c, we can
>> move all NMI logic to a new file.
>
> Ying, the concern is rather about the code scheme in general. Since
> we have notifiers, I think the better way is to be consistent here and use
> a hwerr notifier too. But that's just IMHO ;)

As for whether to use notifiers or not, IMHO a rule can be:

- If it is something like a driver, then it should use a notifier.
- If it is architectural or a PC de facto standard, it can sit outside of
the notifier chain.

I think that treating an unknown NMI as a hardware error should be part of
the PC de facto standard. Do you agree?
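
To make the rule above concrete, here is a rough userspace sketch in C. It is
not kernel code and not the patch under discussion: every name in it (the
sketch_* functions, the policy flags, the result enum) is hypothetical and
exists only for illustration. Driver-like sources go through a handler chain,
while the "unknown NMI is a hardware error" default sits outside the chain and
is gated by the white list, a boot-line off switch, and the perf back-off
suggested earlier in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these names exist in the kernel. */
enum nmi_result { NMI_CLAIMED, NMI_HWERR, NMI_IGNORED };

/* A driver-like handler returns true if the NMI was its own. */
typedef bool (*nmi_handler_t)(void);

#define MAX_HANDLERS 8
nmi_handler_t handlers[MAX_HANDLERS];
size_t nr_handlers;

/* Hypothetical policy inputs. */
bool machine_on_white_list;      /* platform reports hw errors via unknown NMI */
bool hwerr_disabled_on_cmdline;  /* assumed boot-line off switch */
int active_perf_nmi_events;      /* perf events currently able to raise NMIs */

/* Stand-in for a driver that sometimes owns the NMI. */
bool driver_nmi_pending;
bool sketch_driver_handler(void)
{
    return driver_nmi_pending;
}

/* Driver-like sources register on the chain. */
void sketch_register_handler(nmi_handler_t h)
{
    assert(nr_handlers < MAX_HANDLERS);
    handlers[nr_handlers++] = h;
}

/* Architectural default, kept outside the chain. */
bool sketch_unknown_nmi_is_hwerr(void)
{
    if (hwerr_disabled_on_cmdline)
        return false;
    if (!machine_on_white_list)
        return false;
    /* Back off while more than one perf NMI event is active. */
    if (active_perf_nmi_events > 1)
        return false;
    return true;
}

enum nmi_result sketch_handle_nmi(void)
{
    /* First give every registered driver-like handler a chance. */
    for (size_t i = 0; i < nr_handlers; i++)
        if (handlers[i]())
            return NMI_CLAIMED;

    /* Nobody claimed it, so this is an unknown NMI. */
    return sketch_unknown_nmi_is_hwerr() ? NMI_HWERR : NMI_IGNORED;
}
```

With machine_on_white_list set and no handler claiming the NMI,
sketch_handle_nmi() returns NMI_HWERR; setting hwerr_disabled_on_cmdline, or
raising active_perf_nmi_events above 1, makes it return NMI_IGNORED instead.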

Best Regards,
Huang Ying