Re: [PATCH 27/31] x86: MCE: Print header/footer only once for multipleMCEs

From: Hidetoshi Seto
Date: Wed May 27 2009 - 00:32:00 EST


Andi Kleen wrote:
> From: Andi Kleen <ak@xxxxxxxxxxxxxxx>
>
> When multiple MCEs are printed print the "HARDWARE ERROR" header
> and "This is not a software error" footer only once. This
> makes the output much more compact with many CPUs.
>
> Signed-off-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
> ---
> arch/x86/kernel/cpu/mcheck/mce.c | 27 +++++++++++++++++----------
> 1 files changed, 17 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> index 8c55c86..94dddc7 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -176,11 +176,13 @@ void mce_log(struct mce *mce)
> set_bit(0, &notify_user);
> }
>
> -static void print_mce(struct mce *m)
> +static void print_mce(struct mce *m, int *first)
> {
> - printk(KERN_EMERG "\n"
> - KERN_EMERG "HARDWARE ERROR\n"
> - KERN_EMERG
> + if (*first) {
> + printk(KERN_EMERG "\n" KERN_EMERG "HARDWARE ERROR\n");
> + *first = 0;
> + }
> + printk(KERN_EMERG
> "CPU %d: Machine Check Exception: %16Lx Bank %d: %016Lx\n",
> m->extcpu, m->mcgstatus, m->bank, m->status);
> if (m->ip) {

Why don't you have print_mce_head?

I suppose one possible benefit not to have it and do like this is that we
can suppress printing header if there is no log to print.
But on the other hand footer is always printed.

> @@ -200,9 +202,12 @@ static void print_mce(struct mce *m)
> printk(KERN_EMERG "PROCESSOR %u:%x TIME %llu SOCKET %u APIC %x\n",
> m->cpuvendor, m->cpuid, m->time, m->socketid,
> m->apicid);
> - printk(KERN_EMERG "This is not a software problem!\n");
> - printk(KERN_EMERG "Run through mcelog --ascii to decode "
> - "and contact your hardware vendor\n");
> +}
> +
> +static void print_mce_tail(void)
> +{
> + printk(KERN_EMERG "This is not a software problem!\n"
> + KERN_EMERG "Run through mcelog --ascii to decode and contact your hardware vendor\n");
> }
>
> #define PANIC_TIMEOUT 5 /* 5 seconds */

And if there is no log, do we still need to run mcelog?

> @@ -225,6 +230,7 @@ static void wait_for_panic(void)
> static void mce_panic(char *msg, struct mce *final, char *exp)
> {
> int i;
> + int first = 1;
>
> /*
> * Make sure only one CPU runs in machine check panic
> @@ -240,7 +246,7 @@ static void mce_panic(char *msg, struct mce *final, char *exp)
> struct mce *m = &mcelog.entry[i];
> if ((m->status & MCI_STATUS_VAL) &&
> !(m->status & MCI_STATUS_UC))
> - print_mce(m);
> + print_mce(m, &first);
> }
> /* Now print uncorrected but with the final one last */
> for (i = 0; i < MCE_LOG_LEN; i++) {
> @@ -248,12 +254,13 @@ static void mce_panic(char *msg, struct mce *final, char *exp)
> if (!(m->status & MCI_STATUS_VAL))
> continue;
> if (!final || memcmp(m, final, sizeof(struct mce)))
> - print_mce(m);
> + print_mce(m, &first);
> }
> if (final)
> - print_mce(final);
> + print_mce(final, &first);
> if (cpu_missing)
> printk(KERN_EMERG "Some CPUs didn't answer in synchronization\n");
> + print_mce_tail();
> if (exp)
> printk(KERN_EMERG "Machine check: %s\n", exp);
> if (mce_panic_timeout < panic_timeout)

If mce_panic is only caller of mce_log_tail (and possible mce_log_head),
it would be better to use printk directly than having such sub-function.


Thanks,
H.Seto

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/