Re: [RFC] x86, NMI, Treat unknown NMI as hardware error

From: huang ying
Date: Fri May 13 2011 - 20:56:34 EST


On Fri, May 13, 2011 at 11:20 PM, Ingo Molnar <mingo@xxxxxxx> wrote:
>
> * huang ying <huang.ying.caritas@xxxxxxxxx> wrote:
>
>> > What should be done instead is to add an event for unknown NMIs, which can
>> > then be processed by the RAS daemon to implement policy.
>> >
>> > By using 'active' event filters it could even be set on a system to panic
>> > the box by default.
>>
>> If there is real fatal hardware error, maybe we have no luxury to go from NMI
>> handler to user space RAS daemon to determine what to do. System may explode,
>> bad data may go to disk before that.
>
> That is why i suggested:
>
>  > > By using 'active' event filters it could even be set on a system to panic
>  > > the box by default.
>
> event filters are evaluated in the kernel, so the panic could be instantaneous,
> without the event having to reach user-space.

Yes. If we do that in the kernel, that should be doable.

Do 'active' event filters differ much from the DIE_UNKNOWNNMI
notifier chain? What do we gain from the added complexity? Which do
you think is the better way to decide whether to panic on an unknown
NMI?
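
To make the comparison concrete, here is a rough userspace sketch of the
notifier-chain side of the question. It is only an illustration under
stated assumptions: every sketch_* name is made up and is not a kernel
API, SKETCH_UNKNOWN_NMI merely stands in for DIE_UNKNOWNNMI, and a real
handler would call panic() where this one sets a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes, loosely modelled on the kernel's NOTIFY_* idea. */
enum { SKETCH_DONE = 0, SKETCH_STOP = 1 };

/* Hypothetical event id standing in for DIE_UNKNOWNNMI. */
enum { SKETCH_UNKNOWN_NMI = 1 };

struct sketch_notifier {
	int (*call)(struct sketch_notifier *nb, int event, void *data);
	int priority;			/* higher priority runs first */
	struct sketch_notifier *next;
};

static struct sketch_notifier *sketch_chain;

/* Insert a handler into the chain, keeping it sorted by priority. */
static void sketch_register(struct sketch_notifier *nb)
{
	struct sketch_notifier **p = &sketch_chain;

	assert(nb != NULL && nb->call != NULL);
	while (*p && (*p)->priority >= nb->priority)
		p = &(*p)->next;
	nb->next = *p;
	*p = nb;
}

/* Walk the chain until a handler claims the event. */
static int sketch_call_chain(int event, void *data)
{
	struct sketch_notifier *nb;
	int ret = SKETCH_DONE;

	for (nb = sketch_chain; nb; nb = nb->next) {
		ret = nb->call(nb, event, data);
		if (ret == SKETCH_STOP)
			break;
	}
	return ret;
}

/* Assumed policy knob: treat an unknown NMI as a fatal hardware error. */
static int sketch_panic_on_unknown_nmi = 1;
static int sketch_panic_requested;

/* Handler that decides, in "kernel" context, whether to panic. */
static int sketch_hwerr_handler(struct sketch_notifier *nb, int event,
				void *data)
{
	(void)nb;
	(void)data;
	if (event != SKETCH_UNKNOWN_NMI || !sketch_panic_on_unknown_nmi)
		return SKETCH_DONE;
	sketch_panic_requested = 1;	/* a real handler would panic() here */
	return SKETCH_STOP;
}
```

In this sketch the policy lives in whichever handler is registered, and
the decision is made without leaving the handler path. As I understand
the proposal, an 'active' event filter would put the same decision in a
filter expression attached to the event instead.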

Best Regards,
Huang Ying