Re: [RFC] x86/mce: Add workaround for SKX/CLX/CPX spurious machine checks

From: Jue Wang
Date: Tue Feb 08 2022 - 10:04:43 EST


Thanks for the feedback, Tony and Borislav.

I will send out an updated patch.

Thanks,
-Jue

On Mon, Feb 7, 2022 at 1:51 PM Luck, Tony <tony.luck@xxxxxxxxx> wrote:
>
> >> The erratum has made its way through to the public specification
> >> update yet :-(
> >
> > You mean "has not"?
>
> Curse my pathetic typing skills ... "has NOT made its way" is where we are today.
> I don't know when that status will change.
>
> > In any case, I guess you could say something like:
> >
> > pr_err_once("Erratum #XXX detected, disabling fast string copy instructions.\n");
> >
> > or so and people can search with the erratum number later where the doc
> > will explain it in more detail.
>
> When the errata (plural, there are separate lists for SKX and CLX) go public
> we could update this message with the names.

We've found this message, in combination with the logging of the faulting
process info in do_machine_check(), helpful when analyzing the originating
context of the MCEs.

>
> -Tony