Re: [PATCH] KVM: x86: Increase KVM_MAX_VCPUS to 4096

From: Sean Christopherson
Date: Tue Aug 15 2023 - 13:12:01 EST


On Tue, Aug 15, 2023, Kyle Meyer wrote:
> Increase KVM_MAX_VCPUS to 4096 when MAXSMP is enabled.
>
> Notable changes (when MAXSMP is enabled):
>
> * KVM_MAX_VCPUS will increase from 1024 to 4096.
> * KVM_MAX_VCPU_IDS will increase from 4096 to 16384.
> * KVM_HV_MAX_SPARSE_VCPU_SET_BITS will increase from 16 to 64.
> * CPUID[HYPERV_CPUID_IMPLEMENT_LIMITS (0x40000005)].EAX will now be 4096.
>
> * struct kvm will increase from 39408 B to 39792 B.
> * struct kvm_ioapic will increase from 5240 B to 19064 B.
>
> * The following (on-stack) bitmaps will increase from 128 B to 512 B:
>   * dest_vcpu_bitmap in kvm_irq_delivery_to_apic.
>   * vcpu_mask in kvm_hv_flush_tlb.
>   * vcpu_bitmap in ioapic_write_indirect.
>   * vp_bitmap in sparse_set_to_vcpu_mask.
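
For reference, the derived numbers in the list above all follow from the new
maximum. Below is a rough userspace sketch of the arithmetic. It assumes the
vCPU ID ratio is 4 and that a Hyper-V sparse bank covers 64 vCPUs, which
matches the before/after values quoted in the changelog; the helper names are
made up for illustration and are not kernel APIs.

```c
#include <stddef.h>

/* Assumed values, chosen to match the numbers in the changelog. */
#define VCPU_ID_RATIO		4
#define VCPUS_PER_SPARSE_BANK	64
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* Hypothetical helper: number of vCPU IDs for a given vCPU maximum. */
static inline size_t max_vcpu_ids(size_t max_vcpus)
{
	return max_vcpus * VCPU_ID_RATIO;
}

/* Hypothetical helper: number of sparse banks needed to cover all vCPUs. */
static inline size_t sparse_set_bits(size_t max_vcpus)
{
	return DIV_ROUND_UP(max_vcpus, VCPUS_PER_SPARSE_BANK);
}

/* Bytes for a bitmap with one bit per vCPU, rounded up to whole longs. */
static inline size_t vcpu_bitmap_bytes(size_t max_vcpus)
{
	size_t bits_per_long = 8 * sizeof(unsigned long);

	return DIV_ROUND_UP(max_vcpus, bits_per_long) * sizeof(unsigned long);
}
```

With 1024 vCPUs this gives 4096 IDs, 16 banks, and 128-byte bitmaps; with 4096
vCPUs it gives 16384 IDs, 64 banks, and 512-byte bitmaps.
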
>
> Signed-off-by: Kyle Meyer <kyle.meyer@xxxxxxx>
> ---
> Virtual machines with 4096 virtual CPUs have been created on 32 socket
> Cascade Lake and Sapphire Rapids systems.
>
> 4096 is the current maximum value because of the Hyper-V TLFS. See
> BUILD_BUG_ON in arch/x86/kvm/hyperv.c, commit 79661c3, and Vitaly's
> comment on https://lore.kernel.org/all/87r136shcc.fsf@xxxxxxxxxx.

Mostly out of curiosity, do you care about Hyper-V support? If not, at some
point it'd probably be worth exploring a CONFIG_KVM_HYPERV option to allow
disabling KVM's Hyper-V support at compile time so that we're not bound by the
restrictions of the TLFS.
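
To illustrate the restriction: the sparse VP set format uses a 64-bit mask of
valid banks with 64 vCPUs per bank, so it cannot describe more than
64 * 64 = 4096 vCPUs. The sketch below shows how a compile-time check along
those lines could be compiled out. KVM_HYPERV_ENABLED is a made-up stand-in
for the hypothetical CONFIG_KVM_HYPERV, and the helpers are illustrative only,
not existing KVM code.

```c
#include <stdint.h>

#ifndef KVM_MAX_VCPUS
#define KVM_MAX_VCPUS			4096	/* value proposed in the patch */
#endif

#define HV_VCPUS_PER_SPARSE_BANK	64	/* assumed: one 64-bit word per bank */
#define HV_MAX_SPARSE_BANKS		64	/* assumed: one mask bit per bank */

/* Hypothetical switch; set to 0 to model a build without Hyper-V support. */
#ifndef KVM_HYPERV_ENABLED
#define KVM_HYPERV_ENABLED		1
#endif

#if KVM_HYPERV_ENABLED
/* Only enforce the limit when Hyper-V emulation is built in. */
_Static_assert((KVM_MAX_VCPUS + HV_VCPUS_PER_SPARSE_BANK - 1) /
	       HV_VCPUS_PER_SPARSE_BANK <= HV_MAX_SPARSE_BANKS,
	       "sparse VP set cannot cover KVM_MAX_VCPUS");
#endif

/* Bank that holds a given VP index. */
static inline unsigned int hv_bank_of(unsigned int vp_index)
{
	return vp_index / HV_VCPUS_PER_SPARSE_BANK;
}

/* Bit within that bank for a given VP index. */
static inline uint64_t hv_bit_in_bank(unsigned int vp_index)
{
	return UINT64_C(1) << (vp_index % HV_VCPUS_PER_SPARSE_BANK);
}
```

VP index 4095 lands in bank 63, bit 63, i.e. the last slot the format can
express.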

> arch/x86/include/asm/kvm_host.h | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 3bc146dfd38d..91a01fa17fa7 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -39,7 +39,11 @@
>
>  #define __KVM_HAVE_ARCH_VCPU_DEBUGFS
>
> +#ifdef CONFIG_MAXSMP
> +#define KVM_MAX_VCPUS 4096
> +#else
>  #define KVM_MAX_VCPUS 1024
> +#endif

Rather than tightly couple this to MAXSMP, what if we add a Kconfig? I know of
at least one scenario, SVM's AVIC/x2AVIC, where it would be desirable to configure
KVM to a much smaller maximum. The biggest downside I can think of is that KVM
selftests would need to be updated (they assume the max is >=512), and some of the
tests might be completely invalid if KVM_MAX_VCPUS is too low (<256?).

E.g.

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 60d430b4650f..8704748e35d9 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -39,7 +39,7 @@

 #define __KVM_HAVE_ARCH_VCPU_DEBUGFS

-#define KVM_MAX_VCPUS 1024
+#define KVM_MAX_VCPUS CONFIG_KVM_MAX_NR_VCPUS

 /*
  * In x86, the VCPU ID corresponds to the APIC ID, and APIC IDs
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index ed90f148140d..b0f92eb77f78 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -151,6 +151,17 @@ config KVM_PROVE_MMU

 	  If in doubt, say "N".

+config KVM_MAX_NR_VCPUS
+	int "Maximum vCPUs per VM"
+	default "4096" if MAXSMP
+	default "1024"
+	range 1 4096
+	depends on KVM
+	help
+	  Set the maximum number of vCPUs for a single VM. Larger values
+	  increase the memory footprint of each VM regardless of how many vCPUs
+	  are actually created (though the memory increase is relatively small).
+
 config KVM_EXTERNAL_WRITE_TRACKING
 	bool
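
On the selftests point, one option would be for tests to query the limit at
runtime and skip when it is too low. The following is a standalone sketch
using the KVM_CHECK_EXTENSION ioctl with KVM_CAP_MAX_VCPUS; it does not use
the selftests harness, and should_skip() is a hypothetical helper.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/* Returns the reported KVM_CAP_MAX_VCPUS, or -1 if /dev/kvm can't be opened. */
static inline int kvm_max_vcpus(void)
{
	int fd = open("/dev/kvm", O_RDWR);
	int max;

	if (fd < 0)
		return -1;

	max = ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_MAX_VCPUS);
	close(fd);
	return max;
}

/* Hypothetical helper: nonzero if a test needing 'needed' vCPUs must skip. */
static inline int should_skip(int max_vcpus, int needed)
{
	return max_vcpus < needed;
}
```

A test that wants 512 vCPUs would call should_skip(kvm_max_vcpus(), 512) and
bail out early on a kernel built with a smaller limit.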