Re: [PATCH v3 0/9] Parallel CPU bringup for x86_64

From: Tom Lendacky
Date: Fri Dec 17 2021 - 14:46:35 EST


On 12/17/21 1:11 PM, David Woodhouse wrote:
On Fri, 2021-12-17 at 11:48 -0600, Tom Lendacky wrote:
On 12/16/21 6:13 PM, David Woodhouse wrote:
On Thu, 2021-12-16 at 16:52 -0600, Tom Lendacky wrote:
On baremetal, I haven't seen an issue. This only seems to have a problem
with Qemu/KVM.

With 191f08997577 I could boot without issues with and without the
no_parallel_bringup. Only after I applied e78fa57dd642 did the failure happen.

With e78fa57dd642 I could boot 64 vCPUs pretty consistently, but when I
jumped to 128 vCPUs it failed again. When I moved the series to
df9726cb7178, then 64 vCPUs also failed pretty consistently.

Strange thing is it is random. Sometimes (rarely) it works on the first
boot and then sometimes it doesn't, at which point it will reset and
reboot 3 or 4 times and then make it past the failure and fully boot.

Hm, some of that is just artifacts of timing, I'm sure. But now I'm
staring at the way that early_setup_idt() can run in parallel on all
CPUs, rewriting bringup_idt_descr and loading it.

To start with, let's try unlocking the trampoline_lock much later,
after cpu_init_exception_handling() has loaded the real IDT.

I think we can probably make secondaries load the real IDT early and
never use bringup_idt_descr at all, can't we? But let's see if this
makes it go away, to start with...


This still fails. I ran with -d cpu_reset on the command line and will
forward the full log to you. I ran "grep "[ER]IP=" stderr.log | uniq -c"
and got:

128 EIP=00000000 EFL=00000000 [-------] CPL=0 II=0 A20=0 SMM=0 HLT=0
128 EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
These are before running any of the vCPUs.

1 RIP=ffffffff810705c6 RFL=00000206 [-----P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
This is where vCPU0 is at the time of the reset. This address tends to
be different all the time and so I think it is just where it happens to
be when the reset occurs and isn't contributing to the reset.

I note that one is in native_write_msr() though. I wonder what it's writing?

Do you have console output (perhaps with earlyprintk=ttyS0) to go with this?

Yes, but it's not really much help...

[ 0.146318] Freeing SMP alternatives memory: 36K
[ 0.249121] smpboot: CPU0: AMD EPYC Processor (family: 0x17, model: 0x1, stepping: 0x2)
[ 0.249291] Performance Events: AMD PMU driver.
[ 0.249771] ... version: 0
[ 0.250170] ... bit width: 48
[ 0.250258] ... generic registers: 4
[ 0.250662] ... value mask: 0000ffffffffffff
[ 0.251258] ... max period: 00007fffffffffff
[ 0.251790] ... fixed-purpose events: 0
[ 0.252258] ... event mask: 000000000000000f
[ 0.252972] rcu: Hierarchical SRCU implementation.
[ 0.255797] smp: Bringing up secondary CPUs ...
[ 0.256372] x86: Booting SMP configuration:
SecCoreStartupWithStack(0xFFFCC000, 0x820000)
Register PPI Notify: DCD0BE23-9586-40F4-B643-06522CED4EDE
Install PPI: 8C8CE578-8A3D-4F1C-9935-896185C32DD3
Install PPI: 5473C07A-3DCB-4DCA-BD6F-1E9689E7349A
The 0th FV start address is 0x00000820000, size is 0x000E0000, handle is 0x820000

There's no WARN or PANIC, just a reset. I can look to try and capture some
KVM trace data if that would help. If so, let me know what events you'd
like captured.



CPU Reset (CPU 23)
EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0

This one we suspect. Is this what a triple-fault would look like? Not
if it's *already* at f000:fff0, surely?

Good question. The APM doesn't really document it. I'll see if I can find
some h/w folks to check with.

Thanks,
Tom


CPU Reset (CPU 23)
EAX=00000000 EBX=00000000 ECX=00000000 EDX=00800f12
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 0000ffff 00009300
CS =f000 ffff0000 0000ffff 00009a00
SS =0000 00000000 0000ffff 00009200
DS =0000 00000000 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008300
GDT= 00000000 0000ffff
IDT= 00000000 0000ffff
CR0=00000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=00000000 CCD=00000000 CCO=DYNAMIC
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=0000000000000000 0000000000000000 XMM01=0000000000000000 0000000000000000
XMM02=0000000000000000 0000000000000000 XMM03=0000000000000000 0000000000000000
XMM04=0000000000000000 0000000000000000 XMM05=0000000000000000 0000000000000000
XMM06=0000000000000000 0000000000000000 XMM07=0000000000000000 0000000000000000