Re: [PATCH kernel v2] x86/compressed/64: reduce #VC nesting for intercepted CPUID for SEV-SNP guest

From: Borislav Petkov
Date: Sat Sep 30 2023 - 03:18:22 EST


On Tue, Sep 26, 2023 at 02:05:26PM +1000, Alexey Kardashevskiy wrote:
> vc_raw_handle_exception #1: exit_code 72 (CPUID) eax d ecx 1
> We lock the main GHCB and while it is locked we get to

Please use passive voice in your commit message: no "we" or "I", etc,
and describe your changes in imperative mood.

Also, pls read section "2) Describe your changes" in
Documentation/process/submitting-patches.rst for more details.

Also, see section "Changelog" in
Documentation/process/maintainer-tip.rst

Bottom line is: personal pronouns are ambiguous in text, especially with
so many parties/companies/etc developing the kernel so let's avoid them
please.

> snp_cpuid_postprocess() which executes "rdmsr" of MSR_IA32_XSS==0xda0 which
> triggers:
>
> vc_raw_handle_exception #2: exit_code 7c (MSR) ecx da0
> Here we lock the backup ghcb.
>
> And then PMC NMI comes which cannot complete as there is no GHCB page left
> to use:
>
> CPU: 5 PID: 566 Comm: touch Not tainted 6.5.0-rc2-aik-ad9c-g7413e71d3dcf-dirty #27
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown unknown
> Call Trace:
> <NMI>
> dump_stack_lvl+0x44/0x60
> panic+0x222/0x310
> ____sev_get_ghcb+0x21e/0x220
> __sev_es_nmi_complete+0x28/0xf0
> exc_nmi+0x1ac/0x1c0
> end_repeat_nmi+0x16/0x67
> ...
> </NMI>
> <TASK>
> vc_raw_handle_exception+0x9e/0x2c0
> kernel_exc_vmm_communication+0x4d/0xa0
> asm_exc_vmm_communication+0x31/0x60
> RIP: 0010:snp_cpuid+0x2ad/0x420

Drop that splat like we talked.

> +/* Paravirt SEV-ES rdmsr which avoids extra #VC event */
> +#define rdmsr_safe_GHCB(msr, low, high, ghcb, ctxt) ({ \
> + int __ret; \
> + \
> + ghcb_set_rcx((ghcb), (msr)); \
> + __ret = sev_es_ghcb_hv_call((ghcb), (ctxt), SVM_EXIT_MSR, 0, 0); \
> + if (__ret == ES_OK) { \
> + low = (ghcb)->save.rax; \
> + high = (ghcb)->save.rdx; \
> + /* Invalidate qwords for likely another following GHCB call */ \
> + vc_ghcb_invalidate(ghcb); \
> + } \
> + __ret; })
> +

First of all, this should be a function, not a macro.

Then, it should be defined only in sev-shared.c for now.

Furthermore, it should not be called "rdmsr" or so but something like

ghcb_prot_read_msr()

or so to denote that it is using the GHCB protocol to read the MSR. I'm
sure it'll gain more users with time.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette