Re: [PATCH v14 111/113] RFC: KVM: x86, TDX: Add check for setting CPUID

From: Zhi Wang
Date: Sun Jun 04 2023 - 22:25:37 EST


On Sat, 3 Jun 2023 11:02:35 -0700
Isaku Yamahata <isaku.yamahata@xxxxxxxxx> wrote:

> On Sat, Jun 03, 2023 at 09:29:33AM +0800,
> Zhi Wang <zhi.wang.linux@xxxxxxxxx> wrote:
>
> > On Sun, 28 May 2023 21:20:33 -0700
> > isaku.yamahata@xxxxxxxxx wrote:
> >
> > > From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> > >
> > > Add a hook to setting CPUID for additional consistency check of
> > > KVM_SET_CPUID2.
> > >
> > > Because intel TDX or AMD SNP has restriction on the value of cpuid. Some
> > > value can't be changed after boot. Some can be only set at the VM
> > > creation time and can't be changed later. Check if the new values are
> > > consistent with the old values.
> > >
> >
> > Thanks for addressing this from the discussion. The structure looks good to me.
> > I was thinking if the patch should be separated into two parts. One is the
> > common part so that AMD folks can include it in their patch series. Another one
> > is TDX callback which builds on top of the common-part patch.
>
> OK.
>
> > > Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> > > Link: https://lore.kernel.org/lkml/ZDiGpCkXOcCm074O@xxxxxxxxxx/
> > > Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> > > ---
> > > arch/x86/include/asm/kvm-x86-ops.h | 1 +
> > > arch/x86/include/asm/kvm_host.h | 1 +
> > > arch/x86/kvm/cpuid.c | 10 ++++++
> > > arch/x86/kvm/cpuid.h | 2 ++
> > > arch/x86/kvm/vmx/main.c | 10 ++++++
> > > arch/x86/kvm/vmx/tdx.c | 57 ++++++++++++++++++++++++++++--
> > > arch/x86/kvm/vmx/tdx.h | 7 ++++
> > > arch/x86/kvm/vmx/x86_ops.h | 4 +++
> > > 8 files changed, 90 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> > > index c1a4d29cf4fa..5faa13a31f59 100644
> > > --- a/arch/x86/include/asm/kvm-x86-ops.h
> > > +++ b/arch/x86/include/asm/kvm-x86-ops.h
> > > @@ -20,6 +20,7 @@ KVM_X86_OP(hardware_disable)
> > > KVM_X86_OP(hardware_unsetup)
> > > KVM_X86_OP_OPTIONAL_RET0(offline_cpu)
> > > KVM_X86_OP(has_emulated_msr)
> > > +KVM_X86_OP_OPTIONAL_RET0(vcpu_check_cpuid)
> > > KVM_X86_OP(vcpu_after_set_cpuid)
> > > KVM_X86_OP(is_vm_type_supported)
> > > KVM_X86_OP_OPTIONAL(max_vcpus);
> > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > > index 68ddb0da31e0..4efd9770963c 100644
> > > --- a/arch/x86/include/asm/kvm_host.h
> > > +++ b/arch/x86/include/asm/kvm_host.h
> > > @@ -1588,6 +1588,7 @@ struct kvm_x86_ops {
> > > void (*hardware_unsetup)(void);
> > > int (*offline_cpu)(void);
> > > bool (*has_emulated_msr)(struct kvm *kvm, u32 index);
> > > + int (*vcpu_check_cpuid)(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2, int nent);
> > > void (*vcpu_after_set_cpuid)(struct kvm_vcpu *vcpu);
> > >
> > > bool (*is_vm_type_supported)(unsigned long vm_type);
> > > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > > index 0142a73034c4..ef7c361883d7 100644
> > > --- a/arch/x86/kvm/cpuid.c
> > > +++ b/arch/x86/kvm/cpuid.c
> > > @@ -414,6 +414,9 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
> > > }
> > >
> > > r = kvm_check_cpuid(vcpu, e2, nent);
> > > + if (r)
> > > + return r;
> > > + r = static_call(kvm_x86_vcpu_check_cpuid)(vcpu, e2, nent);
> > > if (r)
> > > return r;
> >
> > It would be nice to move the static_call into the kvm_check_cpuid() as it is
> > part of the process of checking cpuid. It is good enough for now as
> > kvm_check_cpuid() only has one caller. Think if more caller of
> > kvm_check_cpuid() shows up in the future, they need to move it into
> > kvm_check_cpuid anyway.
> >
> > >
> > > @@ -1364,6 +1367,13 @@ int kvm_dev_ioctl_get_cpuid(struct kvm_cpuid2 *cpuid,
> > > return r;
> > > }
> > >
> > > +struct kvm_cpuid_entry2 *__kvm_find_cpuid_entry2(
> > > + struct kvm_cpuid_entry2 *entries, int nent, u32 function, u64 index)
> > > +{
> > > + return cpuid_entry2_find(entries, nent, function, index);
> > > +}
> > > +EXPORT_SYMBOL_GPL(__kvm_find_cpuid_entry2);
> > > +
> >
> > If evetually, we have to open kvm_cpuid2 when searching the cpuid entries,
> > I would suggest to open it in kvm_find_cpuid_entry2() instead of introducing
> > a new __kvm_find_cpuid_entry2(). It would be nice to let kvm_find_cpuid_entry2
> > () to take entreis and nent in the previou patch.
>
> Makes sense. Consolidated kvm_find_cpuid_entry2(
> struct kvm_cpuid_entry2 *entries, int nent, u32 function, u64 index).
>
> > > struct kvm_cpuid_entry2 *kvm_find_cpuid_entry2( struct kvm_cpuid2 *cpuid,
> > > u32 function, u32 index)
> > > {
>
> ... snip...
>
> > > +int tdx_vcpu_check_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2, int nent)
> > > +{
> > > + struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
> > > + const struct tdsysinfo_struct *tdsysinfo;
> > > + int i;
> > > +
> > > + tdsysinfo = tdx_get_sysinfo();
> > > + if (!tdsysinfo)
> > > + return -ENOTSUPP;
> > > +
> > > + /*
> > > + * Simple check that new cpuid is consistent with created one.
> > > + * For simplicity, only trivial check. Don't try comprehensive checks
> > > + * with the cpuid virtualization table in the TDX module spec.
> > > + */
> > > + for (i = 0; i < tdsysinfo->num_cpuid_config; i++) {
> > > + const struct tdx_cpuid_config *config = &tdsysinfo->cpuid_configs[i];
> > > + u32 index = config->sub_leaf == TDX_CPUID_NO_SUBLEAF ? 0: config->sub_leaf;
> > > + const struct kvm_cpuid_entry2 *old =
> > > + __kvm_find_cpuid_entry2(kvm_tdx->cpuid, kvm_tdx->cpuid_nent,
> > > + config->leaf, index);
> > > + const struct kvm_cpuid_entry2 *new =
> > > + __kvm_find_cpuid_entry2(e2, nent, config->leaf, index);
> > > +
> > > + if (!!old != !!new)
> > > + return -EINVAL;
> > > + if (!old && !new)
> > > + continue;
> > > +
> > > + if ((old->eax ^ new->eax) & config->eax ||
> > > + (old->ebx ^ new->ebx) & config->ebx ||
> > > + (old->ecx ^ new->ecx) & config->ecx ||
> > > + (old->edx ^ new->edx) & config->edx)
> > > + return -EINVAL;
> > > + }
> > > + return 0;
> > > +}
> >
> > Guess checkpatch.pl will complain about the length of lines above.
>
> By default, the line is 100 chars. Not 80.
>

The reason I raised this up is because: It seems there are two different rules
going on in this patch. One seems respecting the 80-columns limitation. E.g.

struct kvm_cpuid_entry2 *kvm_find_cpuid_entry2( struct kvm_cpuid2 *cpuid,
u32 function, u32 index)

While this one respects a 100-columns rule different than others.

int tdx_vcpu_check_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2, int nent)

Note that 80-columns rules are still in the kernel coding style guide[1].

It would be better to follow it.

[1] https://docs.kernel.org/process/coding-style.html

>
> > > +
> > > void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> > > {
> > > struct vcpu_tdx *tdx = to_tdx(vcpu);
> > > @@ -2003,10 +2044,12 @@ static void setup_tdparams_eptp_controls(struct kvm_cpuid2 *cpuid, struct td_par
> > > }
> > > }
> > >
> > > -static void setup_tdparams_cpuids(const struct tdsysinfo_struct *tdsysinfo,
> > > +static void setup_tdparams_cpuids(struct kvm *kvm,
> > > + const struct tdsysinfo_struct *tdsysinfo,
> > > struct kvm_cpuid2 *cpuid,
> > > struct td_params *td_params)
> > > {
> > > + struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
> > > int i;
> > >
> > > /*
> > > @@ -2014,6 +2057,7 @@ static void setup_tdparams_cpuids(const struct tdsysinfo_struct *tdsysinfo,
> > > * be same to the one of struct tdsysinfo.{num_cpuid_config, cpuid_configs}
> > > * It's assumed that td_params was zeroed.
> > > */
> > > + kvm_tdx->cpuid_nent = 0;
> > > for (i = 0; i < tdsysinfo->num_cpuid_config; i++) {
> > > const struct tdx_cpuid_config *config = &tdsysinfo->cpuid_configs[i];
> > > /* TDX_CPUID_NO_SUBLEAF in TDX CPUID_CONFIG means index = 0. */
> > > @@ -2035,6 +2079,10 @@ static void setup_tdparams_cpuids(const struct tdsysinfo_struct *tdsysinfo,
> > > value->ebx = entry->ebx & config->ebx;
> > > value->ecx = entry->ecx & config->ecx;
> > > value->edx = entry->edx & config->edx;
> > > +
> > > + /* Remember the setting to check for KVM_SET_CPUID2. */
> > > + kvm_tdx->cpuid[kvm_tdx->cpuid_nent] = *entry;
> > > + kvm_tdx->cpuid_nent++;
> > > }
> > > }
> > >
> > > @@ -2130,7 +2178,7 @@ static int setup_tdparams(struct kvm *kvm, struct td_params *td_params,
> > > td_params->tsc_frequency = TDX_TSC_KHZ_TO_25MHZ(kvm->arch.default_tsc_khz);
> > >
> > > setup_tdparams_eptp_controls(cpuid, td_params);
> > > - setup_tdparams_cpuids(tdsysinfo, cpuid, td_params);
> > > + setup_tdparams_cpuids(kvm, tdsysinfo, cpuid, td_params);
> > > ret = setup_tdparams_xfam(cpuid, td_params);
> > > if (ret)
> > > return ret;
> > > @@ -2332,6 +2380,11 @@ static int tdx_td_init(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
> > > if (cmd->flags)
> > > return -EINVAL;
> > >
> > > + kvm_tdx->cpuid = kzalloc(sizeof(init_vm->cpuid.entries[0]) * KVM_MAX_CPUID_ENTRIES,
> > > + GFP_KERNEL);
> > > + if (!kvm_tdx->cpuid)
> > > + return -ENOMEM;
> > > +
> > > init_vm = kzalloc(sizeof(*init_vm) +
> > > sizeof(init_vm->cpuid.entries[0]) * KVM_MAX_CPUID_ENTRIES,
> > > GFP_KERNEL);
> >
> > kfree(kvm_tdx->cpuid) is required in the error handling path of tdx_td_init().
>
>
> No need. tdx_vm_free() frees it. Not here.

From a perspective of correctness, Yes. But there seems different kinds of
strategies of function error handling path going on in this patch series.
Taking __tdx_td_init() as an example, tdx_vm_free() is immediately called
in its error handling path. But, the error handling below the allocation
of cpuid will be deferred to the late VM teardown path in this patch. I am
confused of the strategy of common error handling path, what is the
preferred error handling strategy? Deferring or immediate handling?

Second, it is not graceful to defer the error handling to the teardown
path all the time even they seem work: 1) Readability. It looks like a
problematic error handling at a first glance. 2) Error handling path and
teardown path are for different purposes. Separating the concerns brings
clearer code organization and better maintainability. 3) Testability,
thinking about an error injection test for tdx_td_init(). Un-freed
kvm_tdx->cpuid will look like a memory leaking in this case and needs to
be noted as "free is in another function". It just makes the case more
complicated.

Third, I believe a static code scanner will complain about it.