Re: [PATCH v2 1/3] KVM: x86: Make the hardcoded APIC bus frequency vm variable

From: Maxim Levitsky
Date: Wed Dec 13 2023 - 17:39:55 EST


On Mon, 2023-11-13 at 20:35 -0800, isaku.yamahata@xxxxxxxxx wrote:
> From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
>
> TDX virtualizes the advertised APIC bus frequency to be 25MHz.

Can you explain in a bit more detail why TDX needs this? I am not yet
familiar enough with TDX to fully understand.

AFAIK, the guest writes TMICT, which makes KVM set up an hrtimer. KVM is free
to use any APIC bus frequency to determine the deadline of that timer, and
once the hrtimer fires, KVM injects an interrupt into the guest.

Are some parts of this process overridden by TDX?

I am sure that there is a good reason to do this, but I would be very happy
to see a detailed explanation in the changelog for future readers who
might know nothing about TDX.


> The KVM
> hardcodedes it to be 1GHz. This mismatch causes the vAPIC timer to fire
> earlier than the TDX guest expects.

Here too, what do you mean by "TDX guest expects"? Is the APIC frequency
communicated to the guest through some TDX-specific mechanism, similar to
HV_X64_MSR_APIC_FREQUENCY?
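
To check that I read the mismatch correctly, here is a rough userspace sketch
(not kernel code). It assumes the guest believes the bus runs at 25MHz, as the
changelog states, and programs TMICT for a one second timeout with a divide
count of 1. The function mirrors tmict_to_ns() from the patch, but takes the
cycle length as a parameter; the helper name and the 40ns value (1s / 25MHz)
are my own illustration, not something the patch defines.

```c
#include <stdint.h>

/*
 * Userspace mirror of tmict_to_ns() from the patch. The bus cycle length
 * is passed in explicitly instead of being read from struct kvm.
 * (Illustrative name, not a kernel function.)
 */
static uint64_t tmict_to_ns_sketch(uint32_t tmict, uint64_t bus_cycle_ns,
				   uint32_t divide_count)
{
	return (uint64_t)tmict * bus_cycle_ns * (uint64_t)divide_count;
}

/* Timeout as KVM computes it today: hardcoded 1ns cycle (1GHz bus). */
static uint64_t timeout_kvm_default_ns(uint32_t tmict, uint32_t divide_count)
{
	return tmict_to_ns_sketch(tmict, 1, divide_count);
}

/* Timeout a guest expects if it assumes a 25MHz bus: 40ns per cycle. */
static uint64_t timeout_25mhz_ns(uint32_t tmict, uint32_t divide_count)
{
	return tmict_to_ns_sketch(tmict, 40, divide_count);
}
```

With TMICT = 25000000 and a divide count of 1, the first helper yields
25000000ns (25ms) while the second yields 1000000000ns (1s), so if my
understanding is right the timer would fire 40 times too early.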

> In order to reconcile this mismatch,
> make the frequency configurable for the user space VMM. As the first step,
> Replace the constants with the VM value in struct kvm.



>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> ---
> Changes v2:
> - no change
> ---
> arch/x86/include/asm/kvm_host.h | 2 ++
> arch/x86/kvm/hyperv.c | 2 +-
> arch/x86/kvm/lapic.c | 6 ++++--
> arch/x86/kvm/lapic.h | 4 ++--
> arch/x86/kvm/x86.c | 2 ++
> 5 files changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index d7036982332e..f2b1c6b3fb11 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1334,6 +1334,8 @@ struct kvm_arch {
>
> u32 default_tsc_khz;
> bool user_set_tsc;
> + u64 apic_bus_cycle_ns;
> + u64 apic_bus_frequency;
>
> seqcount_raw_spinlock_t pvclock_sc;
> bool use_master_clock;
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 238afd7335e4..995ce2c74ce0 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -1687,7 +1687,7 @@ static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
> data = (u64)vcpu->arch.virtual_tsc_khz * 1000;
> break;
> case HV_X64_MSR_APIC_FREQUENCY:
> - data = APIC_BUS_FREQUENCY;
> + data = vcpu->kvm->arch.apic_bus_frequency;
> break;
> default:
> kvm_pr_unimpl_rdmsr(vcpu, msr);
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 245b20973cae..73956b0ac1f1 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1542,7 +1542,8 @@ static u32 apic_get_tmcct(struct kvm_lapic *apic)
> remaining = 0;
>
> ns = mod_64(ktime_to_ns(remaining), apic->lapic_timer.period);
> - return div64_u64(ns, (APIC_BUS_CYCLE_NS * apic->divide_count));
> + return div64_u64(ns, (apic->vcpu->kvm->arch.apic_bus_cycle_ns *
> + apic->divide_count));
> }
>
> static void __report_tpr_access(struct kvm_lapic *apic, bool write)
> @@ -1960,7 +1961,8 @@ static void start_sw_tscdeadline(struct kvm_lapic *apic)
>
> static inline u64 tmict_to_ns(struct kvm_lapic *apic, u32 tmict)
> {
> - return (u64)tmict * APIC_BUS_CYCLE_NS * (u64)apic->divide_count;
> + return (u64)tmict * apic->vcpu->kvm->arch.apic_bus_cycle_ns *
> + (u64)apic->divide_count;
> }
>
> static void update_target_expiration(struct kvm_lapic *apic, uint32_t old_divisor)
> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
> index 0a0ea4b5dd8c..3a425ea2a515 100644
> --- a/arch/x86/kvm/lapic.h
> +++ b/arch/x86/kvm/lapic.h
> @@ -16,8 +16,8 @@
> #define APIC_DEST_NOSHORT 0x0
> #define APIC_DEST_MASK 0x800
>
> -#define APIC_BUS_CYCLE_NS 1
> -#define APIC_BUS_FREQUENCY (1000000000ULL / APIC_BUS_CYCLE_NS)
> +#define APIC_BUS_CYCLE_NS_DEFAULT 1
> +#define APIC_BUS_FREQUENCY_DEFAULT (1000000000ULL / APIC_BUS_CYCLE_NS_DEFAULT)
>
> #define APIC_BROADCAST 0xFF
> #define X2APIC_BROADCAST 0xFFFFFFFFul
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 2c924075f6f1..a9f4991b3e2e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -12466,6 +12466,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
> raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags);
>
> kvm->arch.default_tsc_khz = max_tsc_khz ? : tsc_khz;
> + kvm->arch.apic_bus_cycle_ns = APIC_BUS_CYCLE_NS_DEFAULT;
> + kvm->arch.apic_bus_frequency = APIC_BUS_FREQUENCY_DEFAULT;
> kvm->arch.guest_can_read_msr_platform_info = true;
> kvm->arch.enable_pmu = enable_pmu;
>

Only one minor nitpick: we might not need 'apic_bus_frequency', and could instead
calculate it from 'apic_bus_cycle_ns' (to have a single source of truth).

The frequency is only used by HV_X64_MSR_APIC_FREQUENCY, and I don't think that Hyper-V
guests read this MSR often, nor that a division will make a dent in the emulation time
of this MSR, even if they do.

But if you prefer to keep both fields, I won't mind either.
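
Something along these lines is what I have in mind, written as a standalone
sketch rather than a tested kernel change. The helper name is hypothetical,
and in the kernel the division would presumably need div64_u64() so that
32-bit builds keep working.

```c
#include <stdint.h>

#define NSEC_PER_SEC_SKETCH 1000000000ULL

/*
 * Hypothetical helper: derive the APIC bus frequency in Hz from the
 * cycle length in nanoseconds, so only the cycle length is stored.
 * The caller must guarantee that apic_bus_cycle_ns is non-zero.
 */
static inline uint64_t apic_bus_frequency_hz(uint64_t apic_bus_cycle_ns)
{
	return NSEC_PER_SEC_SKETCH / apic_bus_cycle_ns;
}
```

The HV_X64_MSR_APIC_FREQUENCY case would then compute the value from
kvm->arch.apic_bus_cycle_ns on each read instead of loading a second field.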

Best regards,
Maxim Levitsky