[PATCH] perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue

From: Ingo Molnar
Date: Fri Mar 25 2011 - 05:24:23 EST


Eric Dumazet reported that hardware PMU events do not work on his
system, due to the BIOS corrupting PMU state:

Performance Events: PEBS fmt0+, Core2 events, Broken BIOS detected, using software events only.
[Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 186 is 43003c)

Linus suggested that we continue in the face of such BIOS-induced CPU
state corruption:

http://lkml.org/lkml/2011/3/24/608

Such BIOSes will have to be fixed - developers rely on a working and fully
capable PMU and BIOS interfering with CPU state is simply not acceptable.

So this patch changes perf to continue when it detects such BIOS
interaction, some hardware events may be unreliable due to the BIOS writing
and re-writing them - there's not much the kernel can do about that.

Reported-by: Eric Dumazet <eric.dumazet@xxxxxxxxx>
Suggested-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Cc: Frederic Weisbecker <fweisbec@xxxxxxxxx>
Cc: Mike Galbraith <efault@xxxxxx>
Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
---
arch/x86/kernel/cpu/perf_event.c | 9 +++++++--
1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index ec46eea..eb00677 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -500,12 +500,17 @@ static bool check_hw_exists(void)
return true;

bios_fail:
- printk(KERN_CONT "Broken BIOS detected, using software events only.\n");
+ /*
+ * We still allow the PMU driver to operate:
+ */
+ printk(KERN_CONT "Broken BIOS detected, complain to your hardware vendor.\n");
printk(KERN_ERR FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n", reg, val);
- return false;
+
+ return true;

msr_fail:
printk(KERN_CONT "Broken PMU hardware detected, using software events only.\n");
+
return false;
}

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/