Re: Hang when booting guest kernels compiled with clang after SRSO mitigations

From: Sean Christopherson
Date: Fri Aug 11 2023 - 12:03:24 EST


On Fri, Aug 11, 2023, Nathan Chancellor wrote:
> On Fri, Aug 11, 2023 at 12:14:56PM +0200, Borislav Petkov wrote:
> > On Thu, Aug 10, 2023 at 09:14:14AM -0700, Nathan Chancellor wrote:
> > > Not sure how helpful that will be...
> >
> > Yeah, not really. More wild guesses: if you uncomment the UNTRAIN_RET in
> > __svm_vcpu_run() on the host, does that have any effect? Diff below.
>
> Unfortunately, that seems to make no difference...
>
> I did have to switch to the Ryzen 3 box for testing, as I am not at home
> for a couple of days and I did not want to lose access to my workstation
> if I took a bad update since it has no remote management capabilities.
> Something I noticed in doing so is that the VM boot on that machine
> appears to get farther along than on my Threadripper 3990X, but I still
> see a hang with a stack trace similar to the one that I reported in the
> initial post with '-smp 2', so I think it is the same problem but
> perhaps the more cores the VM has, the more likely it is to appear
> totally hung? Might be a red herring but I figured I would mention it in
> case it is relevant.

Might be the flags bug that borks KVM's fastop() emulation; the fix is linked
below. If that fix resolves the hang, my guess is that bringing APs out of WFS
(wait-for-SIPI) somehow triggers emulation.

https://lore.kernel.org/all/20230811155255.250835-1-seanjc@xxxxxxxxxx