Re: [PATCH v9 09/12] KVM: x86: Introduce a function to initialize the PT configuration

From: Alexander Shishkin
Date: Thu Jun 07 2018 - 09:57:03 EST


On Tue, May 22, 2018 at 12:52:12PM +0800, Luwei Kang wrote:
> Initialize the Intel PT configuration when cpuid update.

Is it the CPUID configuration? Is it the MSR configuration? Is it both?
Kind of looks like both. Not sure what is the cpuid update, though.

> Include cpuid inforamtion, rtit_ctl bit mask and the number of
> address ranges.
>
> Signed-off-by: Luwei Kang <luwei.kang@xxxxxxxxx>
> ---
> arch/x86/kvm/vmx.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 70 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 11fb90a..952ddf4 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -10411,6 +10411,72 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
> #undef cr4_fixed1_update
> }
>
> +static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
> +{
> + struct vcpu_vmx *vmx = to_vmx(vcpu);
> + struct kvm_cpuid_entry2 *best = NULL;
> + int i;
> +
> + for (i = 0; i < PT_CPUID_LEAVES; i++) {
> + best = kvm_find_cpuid_entry(vcpu, 0x14, i);
> + if (!best)
> + return;
> + vmx->pt_desc.caps[CPUID_EAX + i*PT_CPUID_REGS_NUM] = best->eax;
> + vmx->pt_desc.caps[CPUID_EBX + i*PT_CPUID_REGS_NUM] = best->ebx;
> + vmx->pt_desc.caps[CPUID_ECX + i*PT_CPUID_REGS_NUM] = best->ecx;
> + vmx->pt_desc.caps[CPUID_EDX + i*PT_CPUID_REGS_NUM] = best->edx;
> + }
> +
> + /* Get the number of configurable Address Ranges for filtering */
> + vmx->pt_desc.addr_range = pt_cap_decode(vmx->pt_desc.caps,
> + PT_CAP_num_address_ranges);
> +
> + /* Initialize and clear the no dependency bits */
> + vmx->pt_desc.ctl_bitmask = ~0ULL;

This looks redundant, doesn't it?

> + vmx->pt_desc.ctl_bitmask = ~(RTIT_CTL_TRACEEN | RTIT_CTL_OS |
> + RTIT_CTL_USR | RTIT_CTL_TSC_EN | RTIT_CTL_DISRETC);
> +
> + /* If CPUID.(EAX=14H,ECX=0):EBX[0]=1 CR3Filter can be set */

This comment makes it less clear than it would have been otherwise.

> + if (pt_cap_decode(vmx->pt_desc.caps, PT_CAP_cr3_filtering))
> + vmx->pt_desc.ctl_bitmask &= ~RTIT_CTL_CR3EN;
> +
> + /*
> + * If CPUID.(EAX=14H,ECX=0):EBX[1]=1 CYCEn, CycThresh and
> + * PSBFreq can be set
> + */
> + if (pt_cap_decode(vmx->pt_desc.caps, PT_CAP_psb_cyc))
> + vmx->pt_desc.ctl_bitmask &= ~(RTIT_CTL_CYCLEACC |
> + RTIT_CTL_CYC_THRESH | RTIT_CTL_PSB_FREQ);
> +
> + /*
> + * If CPUID.(EAX=14H,ECX=0):EBX[3]=1 MTCEn BranchEn and
> + * MTCFreq can be set
> + */
> + if (pt_cap_decode(vmx->pt_desc.caps, PT_CAP_mtc))
> + vmx->pt_desc.ctl_bitmask &= ~(RTIT_CTL_MTC_EN |
> + RTIT_CTL_BRANCH_EN | RTIT_CTL_MTC_RANGE);
> +
> + /* If CPUID.(EAX=14H,ECX=0):EBX[4]=1 FUPonPTW and PTWEn can be set */
> + if (pt_cap_decode(vmx->pt_desc.caps, PT_CAP_ptwrite))
> + vmx->pt_desc.ctl_bitmask &= ~(RTIT_CTL_FUP_ON_PTW |
> + RTIT_CTL_PTW_EN);
> +
> + /* If CPUID.(EAX=14H,ECX=0):EBX[5]=1 PwrEvEn can be set */
> + if (pt_cap_decode(vmx->pt_desc.caps, PT_CAP_power_event_trace))
> + vmx->pt_desc.ctl_bitmask &= ~RTIT_CTL_PWR_EVT_EN;
> +
> + /* If CPUID.(EAX=14H,ECX=0):ECX[0]=1 ToPA can be set */
> + if (pt_cap_decode(vmx->pt_desc.caps, PT_CAP_topa_output))
> + vmx->pt_desc.ctl_bitmask &= ~RTIT_CTL_TOPA;

If you want to be thorough, there's also PT_CAP_single_range_output, which
tells us if RTIT_CTL_TOPA can be *unset*. Otherwise it's required.

> + /* If CPUID.(EAX=14H,ECX=0):ECX[3]=1 FabircEn can be set */
> + if (pt_cap_decode(vmx->pt_desc.caps, PT_CAP_output_subsys))
> + vmx->pt_desc.ctl_bitmask &= ~RTIT_CTL_FABRIC_EN;

Are we sure we want to virtualize this and that it's safe?

> +
> + /* unmask address range configure area */
> + for (i = 0; i < vmx->pt_desc.addr_range; i++)
> + vmx->pt_desc.ctl_bitmask &= ~(0xf << (32 + i * 4));

So, the ctl_bitmask is all the bits that are not allowed?

Regards,
--
Alex