Re: AMD Memory encryption vs. kexec

From: Dave Hansen
Date: Wed Nov 29 2023 - 15:02:16 EST


On 11/28/23 06:03, Tom Lendacky wrote:
...
>> By my reading, the CC_ATTR_HOST_MEM_ENCRYPT is basically a check for
>> whether the current kernel has enabled SME but not SEV while the
>> stop_this_cpu() site is driven purely by whether the hardware *supports*
>> SME.
>>
>> The whole supposed reason stop_this_cpu() checks CPUID directly is that
>> the current kernel SME/SEV enabling might not match the _next_ kernel's
>> enabling choices.
>
> Correct.
>
>> So, why is a _current_ kernel check OK for relocate_kernel(), but not OK
>> for stop_this_cpu()?
>
> The relocate_kernel() check provides an indication of whether SME is
> actually active. The kexec kernel is placed in unencrypted memory to
> match how the system was booted - where the kernel is loaded into
> unencrypted memory and then encrypted in-place if SME is desired
> (mem_encrypt=on). Since the kexec kernel will be unencrypted, the
> cc_platform_has() call is used to indicate whether to perform a wbinvd
> to remove encrypted cache line entries. If SME is not active, then there
> is no need to flush caches prior to booting the kexec kernel.

Ahh, so that wbinvd is truly specific to kexec. It protects the
always-unencrypted kexec area from being clobbered when dirty encrypted
cache lines get written back. It isn't necessary when the old
(kexec'ing) kernel is mem_encrypt=off because the unencrypted old kernel
matches the always-unencrypted kexec area.

What I was worried about was the _larger_ case. Not the kexec area, the
*rest* of memory. But I think that's irrelevant because there's yet
*another* wbinvd in __enc_copy() that will flush the rest of memory
when going from mem_encrypt=off=>on.

I'd like to propose a simplification. Let's add a
CC_ATTR_HOST_MEM_INCOHERENT. That bit gets set on all hardware that
needs WBINVDs at kexec. On AMD, it can use the stop_this_cpu() logic.
This will cause an additional wbinvd in the case where a mem_encrypt=off
kernel is kexec'ing.
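
For illustration only, here is a rough userspace sketch of what "use the
stop_this_cpu() logic" could mean on AMD: the bit is derived from whether
the hardware *supports* SME (CPUID 0x8000001f, EAX bit 0), not from what
the current kernel enabled. The helper name and its raw-value parameters
are hypothetical and exist only so the check can be read in isolation;
this is not the actual kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 0x8000001f, EAX bit 0: the hardware supports SME. */
#define SME_CPUID_LEAF		0x8000001fu
#define SME_SUPPORT_BIT		(1u << 0)

/*
 * Hypothetical helper (not a kernel function).
 *
 * max_ext_leaf: EAX returned by CPUID 0x80000000 (highest extended leaf)
 * leaf_eax:     EAX returned by CPUID 0x8000001f, ignored if that leaf
 *               is not implemented
 *
 * Returns true when the hardware could leave incoherent cache lines
 * behind, regardless of the current kernel's mem_encrypt= setting.
 */
static bool amd_mem_incoherent(uint32_t max_ext_leaf, uint32_t leaf_eax)
{
	if (max_ext_leaf < SME_CPUID_LEAF)
		return false;

	return (leaf_eax & SME_SUPPORT_BIT) != 0;
}
```

Note that the current kernel's SME/SEV enabling is deliberately not an
input here, which is why a mem_encrypt=off kernel would pick up the extra
wbinvd.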

We can also set it on any TDX-enabled Intel hardware.

That leads to very simple logic at kexec:

	Could the old kernel leave incoherent caches around?
	If so, do WBINVD.

That logic gets applied to all CPUs, both boot and secondary. It
applies to all the SME-only systems (currently CC_ATTR_HOST_MEM_ENCRYPT)
and also all TDX systems. It would not depend on the current kernel's
SME enabling and it would allow both kexec-related sites to share the
same logic.
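
A minimal sketch of that shared decision, assuming the proposed attribute
exists. Everything below is a stand-in: struct platform_sketch,
host_mem_incoherent(), wbinvd_stub() and kexec_flush_caches() are
hypothetical names, and the stub only counts calls instead of executing
the real instruction. It is meant to show the shape of the logic, not an
implementation.

```c
#include <stdbool.h>

/* Hypothetical description of the platform; not a kernel structure. */
struct platform_sketch {
	bool amd_sme_supported;	/* hardware supports SME (CPUID-based) */
	bool intel_tdx_enabled;	/* TDX-enabled Intel hardware */
};

/* Stand-in for a cc_platform_has(CC_ATTR_HOST_MEM_INCOHERENT) query. */
static bool host_mem_incoherent(const struct platform_sketch *p)
{
	return p->amd_sme_supported || p->intel_tdx_enabled;
}

/* Stand-in for wbinvd: count invocations rather than flushing caches. */
static int wbinvd_count;

static void wbinvd_stub(void)
{
	wbinvd_count++;
}

/*
 * One check for both kexec-related sites, run on boot and secondary
 * CPUs alike: could the old kernel have left incoherent caches around?
 */
static void kexec_flush_caches(const struct platform_sketch *p)
{
	if (host_mem_incoherent(p))
		wbinvd_stub();
}
```

The point of the shape is that the kexec path asks a single
hardware-level question and never consults the current kernel's
mem_encrypt= choice or any vendor-specific state directly.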

I don't really like the idea of adding yet another CC_ATTR bit
(CC_ATTR_HOST_MEM_INCOHERENT), but I do think it's better than adding
some TDX-specific paths.