Re: [PATCH v5 5/7] arm64: KVM: Add interface to set guest value for TRFCR register

From: James Clark
Date: Fri Feb 23 2024 - 11:40:11 EST

On 23/02/2024 10:03, Suzuki K Poulose wrote:
> On 20/02/2024 10:09, James Clark wrote:
>> Add an interface for the Coresight driver to use to set the value of the
>> TRFCR register for the guest. This register controls the exclude
>> settings for trace at different exception levels, and is used to honor
>> the exclude_host and exclude_guest parameters from the Perf session.
>> This will be used to later write TRFCR_EL1 on nVHE at guest switch. For
>> VHE, the host trace is controlled by TRFCR_EL2 and thus we can write to
>> the TRFCR_EL1 immediately. Because guest writes to the register are
>> trapped, the value will persist and can't be modified.
>>
>> Instead of adding a load of infrastructure to share the host's per-cpu
>> offsets with the hypervisor, just define the new storage as a NR_CPUS
>> array.
>>
>> Signed-off-by: James Clark <james.clark@xxxxxxx>
>> ---
>>   arch/arm64/include/asm/kvm_host.h |  3 +++
>>   arch/arm64/kernel/image-vars.h    |  1 +
>>   arch/arm64/kvm/debug.c            | 26 ++++++++++++++++++++++++++
>>   3 files changed, 30 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index 85b5477bd1b4..56b7f7eca195 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -509,6 +509,7 @@ struct kvm_host_psci_config {
>>       bool psci_0_1_cpu_off_implemented;
>>       bool psci_0_1_migrate_implemented;
>>   };
>> +extern u64 ____cacheline_aligned kvm_guest_trfcr[NR_CPUS];
>>     extern struct kvm_host_psci_config kvm_nvhe_sym(kvm_host_psci_config);
>>   #define kvm_host_psci_config CHOOSE_NVHE_SYM(kvm_host_psci_config)
>> @@ -1174,6 +1175,7 @@ void kvm_arch_vcpu_put_debug_state_flags(struct kvm_vcpu *vcpu);
>>   void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
>>   void kvm_clr_pmu_events(u32 clr);
>>   bool kvm_set_pmuserenr(u64 val);
>> +void kvm_etm_set_guest_trfcr(u64 trfcr_guest);
>>   #else
>>   static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
>>   static inline void kvm_clr_pmu_events(u32 clr) {}
>> @@ -1181,6 +1183,7 @@ static inline bool kvm_set_pmuserenr(u64 val)
>>   {
>>       return false;
>>   }
>> +static inline void kvm_etm_set_guest_trfcr(u64 trfcr_guest) {}
>>   #endif
>>     void kvm_vcpu_load_vhe(struct kvm_vcpu *vcpu);
>> diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
>> index 5e4dc72ab1bd..a451e4f10804 100644
>> --- a/arch/arm64/kernel/image-vars.h
>> +++ b/arch/arm64/kernel/image-vars.h
>> @@ -59,6 +59,7 @@ KVM_NVHE_ALIAS(alt_cb_patch_nops);
>>     /* Global kernel state accessed by nVHE hyp code. */
>>   KVM_NVHE_ALIAS(kvm_vgic_global_state);
>> +KVM_NVHE_ALIAS(kvm_guest_trfcr);
>>     /* Kernel symbols used to call panic() from nVHE hyp code (via ERET). */
>>   KVM_NVHE_ALIAS(nvhe_hyp_panic_handler);
>> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
>> index 49a13e72ddd2..c8d936ce6e2b 100644
>> --- a/arch/arm64/kvm/debug.c
>> +++ b/arch/arm64/kvm/debug.c
>> @@ -22,6 +22,7 @@
>>                   DBG_MDSCR_MDE)
>>     static DEFINE_PER_CPU(u64, mdcr_el2);
>> +u64 ____cacheline_aligned kvm_guest_trfcr[NR_CPUS];
>>     /*
>>    * save/restore_guest_debug_regs
>> @@ -359,3 +360,28 @@ void kvm_arch_vcpu_put_debug_state_flags(struct kvm_vcpu *vcpu)
>>       vcpu_clear_flag(vcpu, DEBUG_STATE_SAVE_TRBE);
>>       vcpu_clear_flag(vcpu, DEBUG_STATE_SAVE_TRFCR);
>>   }
>> +
>> +/*
>> + * Interface for the Coresight driver to use to set the value of the TRFCR
>> + * register for the guest. This register controls the exclude settings for trace
>> + * at different exception levels, and is used to honor the exclude_host and
>> + * exclude_guest parameters from the Perf session.
>> + *
>> + * This will be used to later write TRFCR_EL1 on nVHE at guest switch. For VHE,
>> + * the host trace is controlled by TRFCR_EL2 and thus we can write to the
>> + * TRFCR_EL1 immediately. Because guest writes to the register are trapped, the
>> + * value will persist and can't be modified. For pKVM, kvm_guest_trfcr can't
>> + * be read by the hypervisor, so don't bother writing it.
>> + */
>> +void kvm_etm_set_guest_trfcr(u64 trfcr_guest)
>> +{
>> +    if (WARN_ON_ONCE(!cpuid_feature_extract_unsigned_field(read_sysreg(id_aa64dfr0_el1),
>> +                                   ID_AA64DFR0_EL1_TraceFilt_SHIFT)))
>> +        return;
>> +
>> +    if (has_vhe())
>> +        write_sysreg_s(trfcr_guest, SYS_TRFCR_EL12);
>> +    else if (!is_protected_kvm_enabled())
>> +        kvm_guest_trfcr[smp_processor_id()] = trfcr_guest;
>
> smp_processor_id() could sleep in some configurations ? Should we switch
> to raw_smp_processor_id() to be safer ?
>

I don't think so: it's #defined to raw_smp_processor_id() anyway. With
DEBUG_PREEMPT on it's still raw_smp_processor_id(), but it additionally
validates that preemption is disabled so the value isn't stale.

We actually want that validation, so we should leave it as
smp_processor_id(). I can add a comment saying that this function should
only be called with preemption disabled, but I wouldn't add any extra
validation. Every smp_processor_id() call is already checked when
DEBUG_PREEMPT is on, and this one doesn't seem to be special in any way.
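
To make the DEBUG_PREEMPT behaviour concrete, here is a rough userspace
model of it. This is only a sketch, not the kernel implementation: every
model_* name is made up for illustration, the real check accepts more
cases than a non-zero preempt count (interrupts being disabled, for
example), and the warning text is paraphrased.

```c
/* Userspace model only. All model_* names are hypothetical. */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MODEL_DEBUG_PREEMPT 1	/* stand-in for CONFIG_DEBUG_PREEMPT */
#define MODEL_NR_CPUS 4		/* stand-in for NR_CPUS */

static int model_preempt_count;		/* > 0 means preemption is disabled */
static int model_current_cpu = 2;	/* pretend we are running on CPU 2 */
static int model_warnings;		/* number of validation failures seen */
static uint64_t model_guest_trfcr[MODEL_NR_CPUS];

static int model_raw_smp_processor_id(void)
{
	return model_current_cpu;
}

#if MODEL_DEBUG_PREEMPT
/* Debug variant: same return value, plus a check that it can't go stale. */
static int model_smp_processor_id(void)
{
	if (model_preempt_count == 0) {
		model_warnings++;
		fprintf(stderr, "model: smp_processor_id() used in preemptible code\n");
	}
	return model_raw_smp_processor_id();
}
#else
/* Non-debug variant: a plain alias for the raw accessor. */
#define model_smp_processor_id() model_raw_smp_processor_id()
#endif

static void model_preempt_disable(void)
{
	model_preempt_count++;
}

static void model_preempt_enable(void)
{
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

/* Mirrors only the nVHE branch of the patch: store the value for this CPU. */
static void model_set_guest_trfcr(uint64_t trfcr_guest)
{
	model_guest_trfcr[model_smp_processor_id()] = trfcr_guest;
}
```

Calling model_set_guest_trfcr() between model_preempt_disable() and
model_preempt_enable() stores the value silently. Calling it outside
that window still stores the value but bumps model_warnings, which is
the kind of validation I'd like to keep.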

I also checked that the warning isn't triggered with DEBUG_PREEMPT on,
and there are also a lot of other smp_processor_id() calls on similar
paths in the Coresight driver.

> Otherwise looks good to me.
>
> Suzuki
>
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_etm_set_guest_trfcr);
>
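
For reference, the three cases described in the commit message can be
modelled in plain userspace C as below. This is a sketch under
assumptions, not the patch itself: the sketch_* names and the struct are
hypothetical, the fields merely stand in for TRFCR_EL1 and
kvm_guest_trfcr[], and the missing-feature case returns silently where
the patch uses WARN_ON_ONCE().

```c
/* Userspace model only. All sketch_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 4	/* stand-in for NR_CPUS */

enum sketch_kvm_mode { SKETCH_VHE, SKETCH_NVHE, SKETCH_PKVM };

struct sketch_state {
	enum sketch_kvm_mode mode;
	bool has_trace_filter;	/* models ID_AA64DFR0_EL1.TraceFilt != 0 */
	int cpu;		/* models smp_processor_id() */
	uint64_t trfcr_el1;	/* models the guest's TRFCR_EL1 */
	uint64_t guest_trfcr[SKETCH_NR_CPUS];	/* models kvm_guest_trfcr[] */
};

static void sketch_set_guest_trfcr(struct sketch_state *s, uint64_t trfcr_guest)
{
	/* No trace filtering support: nothing to program. */
	if (!s->has_trace_filter)
		return;

	assert(s->cpu >= 0 && s->cpu < SKETCH_NR_CPUS);

	switch (s->mode) {
	case SKETCH_VHE:
		/* Host trace is controlled by TRFCR_EL2, so write now. */
		s->trfcr_el1 = trfcr_guest;
		break;
	case SKETCH_NVHE:
		/* Stash the value; it is written later at guest switch. */
		s->guest_trfcr[s->cpu] = trfcr_guest;
		break;
	case SKETCH_PKVM:
		/* The hypervisor can't read the array, so skip the store. */
		break;
	}
}
```

Only the nVHE case touches the per-CPU array, which is why that is the
only branch where the CPU number matters.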