Re: [PATCH v19 080/130] KVM: TDX: restore host xsave state when exit from the guest TD

From: Isaku Yamahata
Date: Fri Mar 08 2024 - 15:58:49 EST


On Thu, Mar 07, 2024 at 04:32:16PM +0800,
Chen Yu <yu.c.chen@xxxxxxxxx> wrote:

> On 2024-02-26 at 00:26:22 -0800, isaku.yamahata@xxxxxxxxx wrote:
> > From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> >
> > On exiting from the guest TD, xsave state is clobbered. Restore xsave
> > state on TD exit.
> >
> > Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> > ---
> > v19:
> > - Add EXPORT_SYMBOL_GPL(host_xcr0)
> >
> > v15 -> v16:
> > - Added CET flag mask
> >
> > Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> > ---
> > arch/x86/kvm/vmx/tdx.c | 19 +++++++++++++++++++
> > arch/x86/kvm/x86.c | 1 +
> > 2 files changed, 20 insertions(+)
> >
> > diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> > index 9616b1aab6ce..199226c6cf55 100644
> > --- a/arch/x86/kvm/vmx/tdx.c
> > +++ b/arch/x86/kvm/vmx/tdx.c
> > @@ -2,6 +2,7 @@
> > #include <linux/cpu.h>
> > #include <linux/mmu_context.h>
> >
> > +#include <asm/fpu/xcr.h>
> > #include <asm/tdx.h>
> >
> > #include "capabilities.h"
> > @@ -534,6 +535,23 @@ void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
> > */
> > }
> >
> > +static void tdx_restore_host_xsave_state(struct kvm_vcpu *vcpu)
> > +{
> > +	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
> > +
> > +	if (static_cpu_has(X86_FEATURE_XSAVE) &&
> > +	    host_xcr0 != (kvm_tdx->xfam & kvm_caps.supported_xcr0))
> > +		xsetbv(XCR_XFEATURE_ENABLED_MASK, host_xcr0);
> > +	if (static_cpu_has(X86_FEATURE_XSAVES) &&
> > +	    /* PT can be exposed to TD guest regardless of KVM's XSS support */
> > +	    host_xss != (kvm_tdx->xfam &
> > +			 (kvm_caps.supported_xss | XFEATURE_MASK_PT | TDX_TD_XFAM_CET)))
> > +		wrmsrl(MSR_IA32_XSS, host_xss);
> > +	if (static_cpu_has(X86_FEATURE_PKU) &&
> > +	    (kvm_tdx->xfam & XFEATURE_MASK_PKRU))
> > +		write_pkru(vcpu->arch.host_pkru);
> > +}
>
> Maybe one minor question regarding the pkru restore. In the non-TDX version,
> kvm_load_host_xsave_state() first reads the current setting with
> vcpu->arch.pkru = rdpkru(); if this setting does not equal host_pkru,
> it triggers write_pkru() on the host. Does that mean we can also leverage that
> mechanism in TDX to avoid one pkru write (I guess a pkru write is costlier than
> a pkru read)?

Yes, that's the intention. When the PKRU feature is set for the guest, the TDX
module unconditionally initializes PKRU. Do you have a use case where wrpkru()
(without rdpkru()) is better?
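
To make the two options concrete, here is a rough userspace model of the
read-compare-write pattern versus the unconditional write. It is only a
sketch: sim_pkru, sim_rdpkru(), sim_wrpkru() and both restore_pkru_*()
helpers are hypothetical stand-ins that count accesses, not kernel APIs,
and it says nothing about the real cost of either instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the PKRU register, plus access counters. */
static uint32_t sim_pkru;
static unsigned int sim_pkru_reads, sim_pkru_writes;

/* Models a PKRU read; only counts the access. */
static uint32_t sim_rdpkru(void)
{
	sim_pkru_reads++;
	return sim_pkru;
}

/* Models a PKRU write; only counts the access. */
static void sim_wrpkru(uint32_t val)
{
	sim_pkru_writes++;
	sim_pkru = val;
}

/* Read first, write only when the current value differs from the host's. */
static void restore_pkru_checked(uint32_t host_pkru)
{
	if (sim_rdpkru() != host_pkru)
		sim_wrpkru(host_pkru);
	assert(sim_pkru == host_pkru);
}

/* Always write the host value, without reading the current one. */
static void restore_pkru_unconditional(uint32_t host_pkru)
{
	sim_wrpkru(host_pkru);
	assert(sim_pkru == host_pkru);
}
```

In this model the checked variant always costs one read and saves the write
only when the register already holds the host value; the unconditional
variant always costs exactly one write.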
--
Isaku Yamahata <isaku.yamahata@xxxxxxxxx>