RE: [PATCH v2 1/2] x86: mce: kexec: turn off MCE in kexec
From: Luck, Tony
Date: Fri Feb 27 2015 - 13:27:27 EST
> When CR4.MCE=0b and an MCE happens, it will shutdown the system, at
> least on Intel, according to Tony
I checked with the architects ... and I was right. If you clear CR4.MCE you'll still
see the machine check - and you'll pull the big system reset lever.
If you think the other cpus can survive the reset - then the right thing to do is to
have any offline cpus that show up in the machine check handler just clear MCG_STATUS
and return:
do_machine_check()
{
	/*
	 * Offline cpus may show up for the party - but don't need to
	 * do anything here - send them back home.
	 */
	if (!cpu_online(smp_processor_id())) {
		mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
		return;
	}
	/* ... rest of the handler as before ... */
}
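
To make the control flow concrete, here is a rough user-space model of that early return. It is only a sketch: cpu_online_map, mcg_status, model_machine_check() and model_broadcast_mce() are hypothetical stand-ins invented for illustration, not kernel interfaces, and NR_CPUS is redefined locally as an arbitrary small number. It assumes the machine check is broadcast to every cpu, that each cpu arrives with MCG_STATUS.MCIP set, and that every cpu must clear it on the way out.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4                       /* hypothetical, for the model only */
#define MCG_STATUS_MCIP (1ULL << 2)     /* machine check in progress bit */

/* Hypothetical stand-ins for the online mask and the per-cpu MSR. */
static bool cpu_online_map[NR_CPUS];
static uint64_t mcg_status[NR_CPUS];
static int handled_count;

/* Model of the handler entry: offline cpus clear MCG_STATUS and leave. */
static void model_machine_check(int cpu)
{
	if (!cpu_online_map[cpu]) {
		mcg_status[cpu] = 0;
		return;
	}

	/* Online cpus would do the real work here; the model just counts. */
	handled_count++;
	mcg_status[cpu] = 0;
}

/* Deliver a broadcast machine check to every cpu, online or not. */
static int model_broadcast_mce(void)
{
	int cpu;

	handled_count = 0;
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		mcg_status[cpu] = MCG_STATUS_MCIP;
		model_machine_check(cpu);
	}
	return handled_count;
}
```

With cpus 0 and 1 marked online and the rest offline, model_broadcast_mce() counts two cpus doing real work, and every cpu ends with MCG_STATUS cleared.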
If we are crashing because of a machine check - I wonder how useful it is to run kdump(). There is only a very
small set of ways that you can induce a machine check from program action - normally the problem is that
something bad happened in the h/w ... a kdump will just fill your disk and waste your time looking at what
the s/w was doing when the machine check happened.
-Tony