Re: [PATCH 2/5] x86: KVM: SVM: add support for Invalid IPI Vector interception

From: Sean Christopherson
Date: Thu Sep 28 2023 - 11:46:36 EST


On Thu, Sep 28, 2023, Maxim Levitsky wrote:
> In later revisions of AMD's APM, there is a new 'incomplete IPI' exit code:
>
> "Invalid IPI Vector - The vector for the specified IPI was set to an
> illegal value (VEC < 16)"
>
> Note that tests on a Zen2 machine show that this VM exit doesn't happen and
> instead AVIC just does nothing.
>
> Add support for this exit code by doing nothing, instead of filling
> the kernel log with errors.
>
> Also replace the unthrottled 'pr_err()' that fires when another unknown
> incomplete IPI exit happens with WARN_ONCE()
>
> (e.g. in case AMD adds yet another 'Invalid IPI' exit reason)
>
> Cc: <stable@xxxxxxxxxxxxxxx>
>
> Signed-off-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> ---
> arch/x86/include/asm/svm.h | 1 +
> arch/x86/kvm/svm/avic.c | 5 ++++-
> 2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index 19bf955b67e0da0..3ac0ffc4f3e202b 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -268,6 +268,7 @@ enum avic_ipi_failure_cause {
>  	AVIC_IPI_FAILURE_TARGET_NOT_RUNNING,
>  	AVIC_IPI_FAILURE_INVALID_TARGET,
>  	AVIC_IPI_FAILURE_INVALID_BACKING_PAGE,
> +	AVIC_IPI_FAILURE_INVALID_IPI_VECTOR,
>  };
>
> #define AVIC_PHYSICAL_MAX_INDEX_MASK GENMASK_ULL(8, 0)
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 2092db892d7d052..c44b65af494e3ff 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -529,8 +529,11 @@ int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu)
>  	case AVIC_IPI_FAILURE_INVALID_BACKING_PAGE:
>  		WARN_ONCE(1, "Invalid backing page\n");
>  		break;
> +	case AVIC_IPI_FAILURE_INVALID_IPI_VECTOR:
> +		/* Invalid IPI with vector < 16 */
> +		break;
>  	default:
> -		pr_err("Unknown IPI interception\n");
> +		WARN_ONCE(1, "Unknown avic incomplete IPI interception\n");

Hrm, I'm not sure KVM should WARN here. E.g. if someone runs with panic_on_warn=1,
running on new hardware might crash the host. I hope that AMD is smart enough to
make any future failure types "optional" in the sense that they're either opt-in,
or are largely informational-only (like AVIC_IPI_FAILURE_INVALID_IPI_VECTOR).

I think switching to vcpu_unimpl(), or maybe even pr_err_once(), is more appropriate.
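
To illustrate the pr_err_once() flavor, here is a rough userspace sketch, not
the actual KVM code. log_err_once(), nr_logged and handle_incomplete_ipi() are
hypothetical stand-ins invented for the example. The enum mirrors the one in
the patch, with AVIC_IPI_FAILURE_INVALID_INT_TYPE assumed as the first entry
since the quoted hunk starts after it. Real handling of the existing causes is
omitted.

```c
#include <stdbool.h>
#include <stdio.h>

/* Mirrors enum avic_ipi_failure_cause with the patch applied. */
enum avic_ipi_failure_cause {
	AVIC_IPI_FAILURE_INVALID_INT_TYPE,	/* assumed, not in the quoted hunk */
	AVIC_IPI_FAILURE_TARGET_NOT_RUNNING,
	AVIC_IPI_FAILURE_INVALID_TARGET,
	AVIC_IPI_FAILURE_INVALID_BACKING_PAGE,
	AVIC_IPI_FAILURE_INVALID_IPI_VECTOR,
};

/* Counts emitted messages so the once-only behavior can be observed. */
static int nr_logged;

/*
 * Hypothetical userspace stand-in for pr_err_once(): each call site
 * prints its message at most once, and never triggers a WARN.
 */
#define log_err_once(msg)			\
do {						\
	static bool logged_;			\
	if (!logged_) {				\
		logged_ = true;			\
		nr_logged++;			\
		fputs(msg, stderr);		\
	}					\
} while (0)

/* Hypothetical reduction of the switch in avic_incomplete_ipi_interception(). */
static void handle_incomplete_ipi(unsigned int id)
{
	switch (id) {
	case AVIC_IPI_FAILURE_INVALID_INT_TYPE:
	case AVIC_IPI_FAILURE_TARGET_NOT_RUNNING:
	case AVIC_IPI_FAILURE_INVALID_TARGET:
	case AVIC_IPI_FAILURE_INVALID_BACKING_PAGE:
		/* Existing handling omitted in this sketch. */
		break;
	case AVIC_IPI_FAILURE_INVALID_IPI_VECTOR:
		/* Invalid IPI with vector < 16: nothing to do. */
		break;
	default:
		/* Unknown cause: log once, don't WARN (panic_on_warn=1). */
		log_err_once("Unknown avic incomplete IPI interception\n");
		break;
	}
}
```

With something like this, an unknown failure cause reported by new hardware
would leave a single line in the log instead of either flooding it or tripping
panic_on_warn.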