Re: [PATCH] KVM:SVM: Flush cache only on CPUs running SEV guest

From: Tom Lendacky
Date: Wed Mar 06 2024 - 11:26:58 EST


On 3/5/24 18:19, Sean Christopherson wrote:
On Tue, Mar 05, 2024, Tom Lendacky wrote:
On 3/4/24 11:55, Sean Christopherson wrote:
+Tom

"KVM: SVM:" for the shortlog scope.

On Fri, Mar 01, 2024, Zheyun Shen wrote:
On AMD CPUs without ensuring cache consistency, each memory page reclamation in
an SEV guest triggers a call to wbinvd_on_all_cpus, thereby affecting the
performance of other programs on the host.

Typically, an AMD server may have 128 cores or more, while the SEV guest might only
utilize 8 of these cores. Meanwhile, host can use qemu-affinity to bind these 8 vCPUs
to specific physical CPUs.

Therefore, keeping a record of the physical core numbers each time a vCPU runs
can help avoid flushing the cache for all CPUs every time.

This needs an unequivocal statement from AMD that flushing caches only on CPUs
that do VMRUN is sufficient. That sounds like it should be obviously correct,
as I don't see how else a cache line can be dirtied for the encrypted PA, but
this entire non-coherent caches mess makes me more than a bit paranoid.

As long as the wbinvd_on_all_cpus() related to the ASID flushing isn't
changed, this should be ok. And the code currently flushes the source pages
when doing LAUNCH_UPDATE commands and adding encrypted regions, so should be
good there.

Nice, thanks!

Would it make sense to make this configurable, with the current behavior the
default, until testing looks good for a while?

I don't hate the idea, but I'm inclined to hit the "I'm feeling lucky" button.
I would rather we put in effort to all but guarantee we can do a clean revert in
the future, at which point a kill switch doesn't add all that much value. E.g.
it would allow for a non-disruptive fix, and maybe a slightly faster confirmation
of a bug, but that's about it.

And since the fallout from this would be host data corruption, _not_ rebooting
hosts that may have been corrupted is probably a bad idea, i.e. the whole
non-disruptive fix benefit is quite dubious.

The other issue is that it'd be extremely difficult to know when we could/should
remove the kill switch. It might be months or even years before anyone starts
running high volume of SEV/SEV-ES VMs with this optimization.

I can run the next version of the patch through our CI and see if it uncovers anything. I just worry about corner cases... but then that's just me.

Thanks,
Tom