Re: [PATCH v9 4/6] KVM: x86: Introduce untag_addr() in kvm_x86_ops

From: Sean Christopherson
Date: Thu Jun 29 2023 - 11:34:04 EST


On Thu, Jun 29, 2023, Binbin Wu wrote:
> On 6/29/2023 2:57 PM, Chao Gao wrote:
> > On Thu, Jun 29, 2023 at 02:12:27PM +0800, Binbin Wu wrote:
> > > > > + /*
> > > > > + * Check LAM_U48 in cr3_ctrl_bits to avoid guest_cpuid_has().
> > > > > + * If not set, vCPU doesn't support LAM.
> > > > > + */
> > > > > + if (!(vcpu->arch.cr3_ctrl_bits & X86_CR3_LAM_U48) ||
> > > > This is unnecessary, KVM should never allow the LAM bits in CR3 to be set if LAM
> > > > isn't supported.
> > A corner case is:
> >
> > If EPT is enabled, CR3 writes are not trapped, so guests can set the
> > LAM bits in CR3 if hardware supports LAM, regardless of whether or not
> > the guest enumerates LAM.

Argh, that's a really obnoxious virtualization hole.

> I recalled the main reason why I added the check.
> It's used to avoid the following checking on CR3 & CR4, which may cause an
> additional VMREAD.

FWIW, that will (and should) be handled by kvm_get_active_lam_bits(). Hmm, though
since CR4.LAM_SUP is a separate thing, that should probably be
kvm_get_active_cr3_lam_bits().
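
For illustration only, a minimal standalone sketch of what such a helper could
look like. The struct and its fields are hypothetical stand-ins rather than
KVM's actual vCPU state, and the function name merely mirrors the one suggested
above. The point is that guest CPUID is checked first, so a vCPU without LAM
never needs its CR3 consulted at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CR3 LAM control bits: LAM_U57 is bit 61, LAM_U48 is bit 62. */
#define X86_CR3_LAM_U57	(1ULL << 61)
#define X86_CR3_LAM_U48	(1ULL << 62)

/* Hypothetical stand-in for the relevant vCPU state, not KVM's struct. */
struct vcpu_sketch {
	bool guest_has_lam;	/* assumption: guest CPUID enumerates LAM */
	uint64_t cr3;		/* assumption: cached guest CR3 value */
};

/*
 * Return the CR3 LAM bits that are architecturally active for the guest.
 * If the guest doesn't enumerate LAM, no bits are active, even if the
 * guest managed to set them in hardware through the virtualization hole.
 */
static uint64_t get_active_cr3_lam_bits(const struct vcpu_sketch *vcpu)
{
	assert(vcpu);

	if (!vcpu->guest_has_lam)
		return 0;

	return vcpu->cr3 & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
}
```

A caller would then test the returned mask instead of open coding checks on
CR3 at each use site.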

> Also, about the virtualization hole, if the guest can enable LAM bits in CR3
> in non-root mode without causing any problem, that means the hardware supports
> LAM; should KVM continue to untag the address following the CR3 setting?

Hrm, no, KVM should honor the architecture. The virtualization hole is bad enough
as it is, I don't want KVM to actively make it worse.

> Because skipping the untagging will probably cause a guest failure, and of
> course, the guest itself is to blame.

Yeah, guest's fault. The fact that the guest won't get all the #GPs it should
is unfortunate, but intercepting all writes to CR3 just to close the hole is sadly
a really bad tradeoff.

> But untagging the address seems to do no harm?

In and of itself, not really. But I don't want to set the precedent in KVM that
user LAM is supported regardless of guest CPUID.
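
To make "honor the architecture" concrete, here is a rough standalone sketch of
user-pointer untagging that is gated on guest CPUID. The function names and the
guest_has_lam parameter are made up for illustration and this is not the code
from the patch. It assumes LAM_U57 takes precedence over LAM_U48 when both are
set, and that untagging sign-extends from bit 56 (LAM57) or bit 47 (LAM48) while
preserving bit 63.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X86_CR3_LAM_U57	(1ULL << 61)
#define X86_CR3_LAM_U48	(1ULL << 62)
#define ADDR_BIT_63	(1ULL << 63)

/* Sign-extend 'value' from bit 'index' without relying on signed shifts. */
static uint64_t sign_extend_from(uint64_t value, unsigned int index)
{
	uint64_t sign = 1ULL << index;
	uint64_t low = value & ((sign << 1) - 1);

	assert(index == 47 || index == 56);
	return (low ^ sign) - sign;
}

/*
 * Hypothetical helper: strip LAM metadata from a user pointer, but only if
 * the guest enumerates LAM.  Supervisor pointers (bit 63 set) are returned
 * unchanged, as they are governed by CR4.LAM_SUP, not CR3.
 */
static uint64_t untag_user_addr(bool guest_has_lam, uint64_t cr3, uint64_t gva)
{
	unsigned int lam_bit;

	if (!guest_has_lam || (gva & ADDR_BIT_63))
		return gva;

	if (cr3 & X86_CR3_LAM_U57)
		lam_bit = 56;
	else if (cr3 & X86_CR3_LAM_U48)
		lam_bit = 47;
	else
		return gva;

	/* Bit 63 is not metadata, so restore it after sign extension. */
	return (sign_extend_from(gva, lam_bit) & ~ADDR_BIT_63) |
	       (gva & ADDR_BIT_63);
}
```

With this shape, a guest that sets the CR3 LAM bits without enumerating LAM gets
its addresses back untouched, and any tagged pointer then fails the usual
canonical checks.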

Another problem with the virtualization hole is that the guest will be able
to induce VM-Fail when KVM is running on L1, because L0 will likely enforce the
CR3 checks on VM-Enter but not intercept MOV CR3. I.e. the guest can get an
illegal value into vmcs.GUEST_CR3. We could add code to explicitly detect that
case to help triage such failures, but I don't know that it's worth the code, e.g.

	if (exit_reason.failed_vmentry) {
		if (boot_cpu_has(X86_FEATURE_LAM) &&
		    !guest_can_use(X86_FEATURE_LAM) &&
		    (kvm_read_cr3(vcpu) & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57)))
			pr_warn_ratelimited("Guest abused LAM virtualization hole\n");
		else
			dump_vmcs(vcpu);
		vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
		vcpu->run->fail_entry.hardware_entry_failure_reason
			= exit_reason.full;
		vcpu->run->fail_entry.cpu = vcpu->arch.last_vmentry_cpu;
		return 0;
	}