Re: General protection fault in `switch_mm_irqs_off()`

From: Paul Menzel
Date: Wed Oct 02 2019 - 11:52:43 EST


[CC: +affected coreboot folks, +coreboot mailing list]

Dear Thomas,


More affected people discussed this issue on the coreboot mailing list [1].

On 2019-01-14 18:37, Lendacky, Thomas wrote:
> On 1/14/19 11:09 AM, Paul Menzel wrote:

>> On 01/14/19 18:00, Lendacky, Thomas wrote:
>>> On 1/10/19 12:34 PM, Lendacky, Thomas wrote:
>>>> On 1/10/19 10:49 AM, Paul Menzel wrote:
>>>>> Dear Boris, dear Thomas,
>>>>>
>>>>>
>>>>> On 01/10/19 17:00, Borislav Petkov wrote:
>>>>>> On Thu, Jan 10, 2019 at 02:57:40PM +0100, Paul Menzel wrote:
>>>>>>> Thank you very much. Indeed, the machine does not crash. I used Linus’
>>>>>>> master branch for testing, and applied your patch on top. Please find
>>>>>>> the full log attached.
>>>>>>
>>>>>>> 80.649: [ 3.197107] Spectre V2 : spectre_v2_user_select_mitigation: set X86_FEATURE_USE_IBPB
>>>>>>
>>>>>> This is amazing.
>>>>>>
>>>>>> Ok, next diff, same exercise. Thx.
>>>>>> ---
>>>>>> diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
>>>>>> index dad12b767ba0..528ef8336f5f 100644
>>>>>> --- a/arch/x86/include/asm/nospec-branch.h
>>>>>> +++ b/arch/x86/include/asm/nospec-branch.h
>>>>>> @@ -284,6 +284,12 @@ static inline void indirect_branch_prediction_barrier(void)
>>>>>> {
>>>>>> u64 val = PRED_CMD_IBPB;
>>>>>>
>>>>>> + if (WARN_ON(boot_cpu_has(X86_FEATURE_USE_IBPB))) {
>>>>>> + pr_info("%s: c: %px, array: 0x%x\n",
>>>>>> + __func__, &boot_cpu_data, boot_cpu_data.x86_capability[7]);
>>>>>> + return;
>>>>>> + }
>>>>>> +
>>>>>> alternative_msr_write(MSR_IA32_PRED_CMD, val, X86_FEATURE_USE_IBPB);
>>>>>> }
>>>>>>
>>>>>> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
>>>>>> index 8654b8b0c848..e818e5abe611 100644
>>>>>> --- a/arch/x86/kernel/cpu/bugs.c
>>>>>> +++ b/arch/x86/kernel/cpu/bugs.c
>>>>>> @@ -371,6 +371,9 @@ spectre_v2_user_select_mitigation(enum spectre_v2_mitigation_cmd v2_cmd)
>>>>>> if (boot_cpu_has(X86_FEATURE_IBPB)) {
>>>>>> setup_force_cpu_cap(X86_FEATURE_USE_IBPB);
>>>>>>
>>>>>> + pr_err("%s: set X86_FEATURE_USE_IBPB, c: %px, array: 0x%x\n",
>>>>>> + __func__, &boot_cpu_data, boot_cpu_data.x86_capability[7]);
>>>>>> +
>>>>>> switch (cmd) {
>>>>>> case SPECTRE_V2_USER_CMD_FORCE:
>>>>>> case SPECTRE_V2_USER_CMD_PRCTL_IBPB:
>>>>>> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
>>>>>> index cb28e98a0659..8566737fa500 100644
>>>>>> --- a/arch/x86/kernel/cpu/common.c
>>>>>> +++ b/arch/x86/kernel/cpu/common.c
>>>>>> @@ -765,6 +765,9 @@ static void apply_forced_caps(struct cpuinfo_x86 *c)
>>>>>> c->x86_capability[i] &= ~cpu_caps_cleared[i];
>>>>>> c->x86_capability[i] |= cpu_caps_set[i];
>>>>>> }
>>>>>> +
>>>>>> + if (c == &boot_cpu_data)
>>>>>> + pr_info("%s: c: %px, array: 0x%x\n", __func__, c, c->x86_capability[7]);
>>>>>> }
>>>>>>
>>>>>> static void init_speculation_control(struct cpuinfo_x86 *c)
>>>>>> @@ -778,6 +781,10 @@ static void init_speculation_control(struct cpuinfo_x86 *c)
>>>>>> if (cpu_has(c, X86_FEATURE_SPEC_CTRL)) {
>>>>>> set_cpu_cap(c, X86_FEATURE_IBRS);
>>>>>> set_cpu_cap(c, X86_FEATURE_IBPB);
>>>>>> +
>>>>>> + pr_info("%s: X86_FEATURE_SPEC_CTRL: c: %px, array: 0x%x, CPUID: 0x%x\n",
>>>>>> + __func__, c, c->x86_capability[7], cpuid_edx(7));
>>>>>> +
>>>>>> set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL);
>>>>>> }
>>>>>>
>>>>>> @@ -793,9 +800,13 @@ static void init_speculation_control(struct cpuinfo_x86 *c)
>>>>>> set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL);
>>>>>> }
>>>>>>
>>>>>> - if (cpu_has(c, X86_FEATURE_AMD_IBPB))
>>>>>> + if (cpu_has(c, X86_FEATURE_AMD_IBPB)) {
>>>>>> set_cpu_cap(c, X86_FEATURE_IBPB);
>>>>>>
>>>>>> + pr_info("%s: X86_FEATURE_AMD_IBPB: c: %px, array: 0x%x, CPUID: 0x%x\n",
>>>>>> + __func__, c, c->x86_capability[7], cpuid_ebx(0x80000008));
>>>>>> + }
>>>>>> +
>>>>>> if (cpu_has(c, X86_FEATURE_AMD_STIBP)) {
>>>>>> set_cpu_cap(c, X86_FEATURE_STIBP);
>>>>>> set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL);
>>>>>
>>>>> Please find the logs attached.
>>>>
>>>> Ah, so the CPUID value is showing X86_FEATURE_AMD_IBPB (not sure why the
>>>> cpuid command was showing a value of zero for EBX in your previous email).
>>>> Let me see what I can find out about this processor/firmware relation. I
>>>> wouldn't expect to see the #GP given that the firmware says IBPB is
>>>> supported.
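
For reference, the check described above can be sketched outside the kernel. This is a minimal illustration, assuming that IBPB support is reported in bit 12 of EBX from CPUID leaf 0x80000008 (the bit the kernel maps to X86_FEATURE_AMD_IBPB). The helper name and the sample register values are hypothetical and are not taken from the attached logs.

```python
# Assumption: IBPB support is reported in bit 12 of EBX for CPUID leaf 0x80000008.
AMD_IBPB_BIT = 12


def ebx_reports_ibpb(ebx: int) -> bool:
    """Return True if the given EBX value has the IBPB bit set."""
    return bool((ebx >> AMD_IBPB_BIT) & 1)


# Hypothetical sample values, not taken from the logs in this thread.
print(ebx_reports_ibpb(0x00001000))  # only bit 12 set
print(ebx_reports_ibpb(0x00000000))  # EBX of zero, as the earlier cpuid output suggested
```

Feeding it the "CPUID: 0x..." value printed by the debug patch for the X86_FEATURE_AMD_IBPB branch would show whether the firmware advertises IBPB.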
>>>
>>> I'm not able to reproduce this issue on my family 21, model 1, stepping 2
>>> processor (AMD Opteron(TM) Processor 6274) as I am able to successfully
>>> write to the PRED_CMD MSR.
>>
>> It’s not exactly the same processor, but I guess the same family should be
>> good enough. What board do you have? Do you have two sockets, and both
>> populated?
>
> Yes, it is a two-socket system with two processors installed.
>
>> Here is an Asus KGPE-D16 with two AMD Opterons put in.
>>
>> Lastly, my microcode updates are applied in firmware, and not by GNU/Linux.
>
> Ok, I was confused on how you had reported that, sorry.

Kinky reports that populating the memory slots of both NUMA nodes fixes this.
Kinky, exactly which slots do you have populated?

I haven’t been able to verify that yet, but please find attached the output of
`sudo dmidecode -t memory` from an affected system with 8 × 16 GB of memory.
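
To make slot populations easier to compare between reporters, the dmidecode output could be summarized per node. The following is only a sketch, assuming the "Locator: NODE <n> <slot>" format that this board's firmware reports (other firmware may label slots differently) and that empty slots show "Size: No Module Installed". The function name is hypothetical.

```python
import re
from collections import defaultdict


def populated_slots_by_node(dmidecode_text: str) -> dict:
    """Group populated DIMM slots by node, based on the Size and Locator fields."""
    slots = defaultdict(list)
    size = None
    for raw in dmidecode_text.splitlines():
        line = raw.strip()
        if line.startswith("Size:"):
            # Size appears before Locator within each Memory Device record.
            size = line.split(":", 1)[1].strip()
        elif line.startswith("Locator:"):
            match = re.match(r"Locator:\s*NODE (\d+) (\S+)", line)
            # Skip empty slots, which dmidecode reports as "No Module Installed".
            if match and size and size != "No Module Installed":
                slots[int(match.group(1))].append(match.group(2))
            size = None
    return dict(slots)


# Small made-up excerpt in the same shape as the attached output.
sample = """Memory Device
Size: 16384 MB
Locator: NODE 0 DIMM_A2
Bank Locator: Not Specified
Memory Device
Size: 16384 MB
Locator: NODE 1 DIMM_A2
"""
print(populated_slots_by_node(sample))
```

Run against the attached output, this should list DIMM_A2 through DIMM_D2 for both node 0 and node 1.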

> Can we try an experiment where you use the older version of the Asus
> firmware but build an initramfs that will perform early microcode loading?
> I'm curious if things will work when loaded via Linux.

I believe the users reported that this works.

>>> Let's check the firmware file that you're loading. The one I'm using is:
>>>
>>> $ sha1sum /lib/firmware/amd-ucode/microcode_amd_fam15h.bin
>>> 90896256951d8edf7baf8181ae11e2dc618a5171 /lib/firmware/amd-ucode/microcode_amd_fam15h.bin
>>>
>>> Does that match what you have?
>>
>> Yes, that matches exactly.
>>
>> 90896256951d8edf7baf8181ae11e2dc618a5171 3rdparty/blobs/cpu/amd/family_15h/microcode_amd_fam15h.bin
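
For others who want to compare their microcode file without `sha1sum`, here is a rough equivalent, assuming only the digest quoted above. The function names are hypothetical, and the path to pass in depends on where the file lives on your system.

```python
import hashlib

# Digest quoted in this thread for microcode_amd_fam15h.bin.
EXPECTED_SHA1 = "90896256951d8edf7baf8181ae11e2dc618a5171"


def sha1_of_file(path: str) -> str:
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA1) -> bool:
    """Return True if the file's SHA-1 equals the expected digest."""
    return sha1_of_file(path) == expected.lower()
```

Calling `matches_expected("/lib/firmware/amd-ucode/microcode_amd_fam15h.bin")` would then tell you whether your copy is the same file.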


Kind regards,

Paul


[1]: https://mail.coreboot.org/hyperkitty/list/coreboot@xxxxxxxxxxxx/thread/QZIVOD4UADLLPZEE7MFUUTQQM343GKOC/

# dmidecode 3.0
Getting SMBIOS data from sysfs.
SMBIOS 2.7 present.

Handle 0x0006, DMI type 16, 23 bytes
Physical Memory Array
Location: System Board Or Motherboard
Use: System Memory
Error Correction Type: Single-bit ECC
Maximum Capacity: 256 GB
Error Information Handle: Not Provided
Number Of Devices: 8

Handle 0x0007, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x0006
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 16384 MB
Form Factor: DIMM
Set: None
Locator: NODE 0 DIMM_A2
Bank Locator: Not Specified
Type: DDR3
Type Detail: Synchronous Registered (Buffered)
Speed: 800 MHz
Manufacturer: Crucial
Serial Number: 4E156411
Asset Tag: Not Specified
Part Number: 36KSF2G72PZ-1G6P1
Rank: 2
Configured Clock Speed: 800 MHz
Minimum Voltage: 1.35 V
Maximum Voltage: 1.5 V
Configured Voltage: 1.5 V

Handle 0x0008, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x0006
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 16384 MB
Form Factor: DIMM
Set: None
Locator: NODE 0 DIMM_B2
Bank Locator: Not Specified
Type: DDR3
Type Detail: Synchronous Registered (Buffered)
Speed: 800 MHz
Manufacturer: Crucial
Serial Number: 4D156411
Asset Tag: Not Specified
Part Number: 36KSF2G72PZ-1G6P1
Rank: 2
Configured Clock Speed: 800 MHz
Minimum Voltage: 1.35 V
Maximum Voltage: 1.5 V
Configured Voltage: 1.5 V

Handle 0x0009, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x0006
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 16384 MB
Form Factor: DIMM
Set: None
Locator: NODE 0 DIMM_C2
Bank Locator: Not Specified
Type: DDR3
Type Detail: Synchronous Registered (Buffered)
Speed: 800 MHz
Manufacturer: Crucial
Serial Number: 9B924E13
Asset Tag: Not Specified
Part Number: 36KSF2G72PZ-1G6P1
Rank: 2
Configured Clock Speed: 800 MHz
Minimum Voltage: 1.35 V
Maximum Voltage: 1.5 V
Configured Voltage: 1.5 V

Handle 0x000A, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x0006
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 16384 MB
Form Factor: DIMM
Set: None
Locator: NODE 0 DIMM_D2
Bank Locator: Not Specified
Type: DDR3
Type Detail: Synchronous Registered (Buffered)
Speed: 800 MHz
Manufacturer: Crucial
Serial Number: BF924E13
Asset Tag: Not Specified
Part Number: 36KSF2G72PZ-1G6P1
Rank: 2
Configured Clock Speed: 800 MHz
Minimum Voltage: 1.35 V
Maximum Voltage: 1.5 V
Configured Voltage: 1.5 V

Handle 0x000B, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x0006
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 16384 MB
Form Factor: DIMM
Set: None
Locator: NODE 1 DIMM_A2
Bank Locator: Not Specified
Type: DDR3
Type Detail: Synchronous Registered (Buffered)
Speed: 800 MHz
Manufacturer: Crucial
Serial Number: 93D4D012
Asset Tag: Not Specified
Part Number: 36KSF2G72PZ-1G6P1
Rank: 2
Configured Clock Speed: 800 MHz
Minimum Voltage: 1.35 V
Maximum Voltage: 1.5 V
Configured Voltage: 1.5 V

Handle 0x000C, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x0006
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 16384 MB
Form Factor: DIMM
Set: None
Locator: NODE 1 DIMM_B2
Bank Locator: Not Specified
Type: DDR3
Type Detail: Synchronous Registered (Buffered)
Speed: 800 MHz
Manufacturer: Crucial
Serial Number: 6B6FD112
Asset Tag: Not Specified
Part Number: 36KSF2G72PZ-1G6P1
Rank: 2
Configured Clock Speed: 800 MHz
Minimum Voltage: 1.35 V
Maximum Voltage: 1.5 V
Configured Voltage: 1.5 V

Handle 0x000D, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x0006
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 16384 MB
Form Factor: DIMM
Set: None
Locator: NODE 1 DIMM_C2
Bank Locator: Not Specified
Type: DDR3
Type Detail: Synchronous Registered (Buffered)
Speed: 800 MHz
Manufacturer: Crucial
Serial Number: F1136411
Asset Tag: Not Specified
Part Number: 36KSF2G72PZ-1G6P1
Rank: 2
Configured Clock Speed: 800 MHz
Minimum Voltage: 1.35 V
Maximum Voltage: 1.5 V
Configured Voltage: 1.5 V

Handle 0x000E, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x0006
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 16384 MB
Form Factor: DIMM
Set: None
Locator: NODE 1 DIMM_D2
Bank Locator: Not Specified
Type: DDR3
Type Detail: Synchronous Registered (Buffered)
Speed: 800 MHz
Manufacturer: Crucial
Serial Number: F3146411
Asset Tag: Not Specified
Part Number: 36KSF2G72PZ-1G6P1
Rank: 2
Configured Clock Speed: 800 MHz
Minimum Voltage: 1.35 V
Maximum Voltage: 1.5 V
Configured Voltage: 1.5 V
