Re: GHES: Failed to read error status

From: Don Zickus
Date: Thu Nov 17 2011 - 11:31:17 EST


On Tue, Nov 15, 2011 at 08:29:56AM -0700, Bjorn Helgaas wrote:
> [+linux-acpi]
>
> On Mon, Nov 14, 2011 at 11:36 AM, Dave Jones <davej@xxxxxxxxxx> wrote:
> > It appears that there's a problem with Dell PowerEdge servers
> > and GHES judging by the bug reports at
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=746755
> > https://bugs.launchpad.net/ubuntu/+bug/881164
> >
> > Is this likely to be something that Dell need to fix in a firmware update,
> > or something that the code needs to accommodate?

I think one problem was that in 2.6.38 the kernel saw that HEST/GHES was
supported and just tried to communicate with it. Unfortunately, the kernel
forgot to tell the BIOS that it supports firmware-first mode, so in this
case the firmware is probably blocking the GHES access and the kernel is
confused as to why.
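
As a rough illustration of the ordering problem described above, here is a
toy user-space model in C. It is only a sketch: the struct and the toy_*
function names are hypothetical and do not exist in the kernel or in ACPI.
It assumes, as suggested above, that the firmware refuses GHES reads until
the OS has announced its support through an _OSC-style call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel or in ACPI. */
struct toy_firmware {
	bool os_announced_apei;    /* set once the OS has made its _OSC-style call */
	unsigned int error_status; /* stand-in for a GHES error status block */
};

/* Hypothetical stand-in for the OS telling the firmware it supports APEI. */
static void toy_osc_announce_apei(struct toy_firmware *fw)
{
	assert(fw != NULL);
	fw->os_announced_apei = true;
}

/*
 * Returns 0 and fills *status on success, or -1 if the (assumed) firmware
 * policy blocks the read because the OS never announced its support.
 */
static int toy_ghes_read_status(const struct toy_firmware *fw,
				unsigned int *status)
{
	assert(fw != NULL && status != NULL);
	if (!fw->os_announced_apei)
		return -1; /* the "failed to read error status" case */
	*status = fw->error_status;
	return 0;
}
```

In this model, calling toy_ghes_read_status() before
toy_osc_announce_apei() fails, and the same call succeeds afterwards, which
is the behaviour change the _OSC handshake is meant to bring about.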

The following commits resolved that issue (which may not entirely fix the
problem, but might move it along).

9fb0bfe ACPI, APEI, Add WHEA _OSC support
b3b46d7 APEI: Fix WHEA _OSC call

The second one in particular was noticed by Dell. I do recall that we
needed to update the firmware to get Dell boxes working, though that was
probably for EINJ.

Cheers,
Don