Re: [PATCHv2 2/3] x86/tdx: Fix race between set_memory_encrypted() and load_unaligned_zeropad()

From: Tom Lendacky
Date: Fri Jun 02 2023 - 13:05:51 EST

Next message: Mark Brown: "Re: [PATCH] ASoC: fsl_sai: Enable BCI bit if SAI works on synchronous mode with BYP asserted"
Previous message: Linus Torvalds: "Re: Bug in short splice to socket?"
In reply to: Michael Kelley (LINUX): "RE: [PATCHv2 2/3] x86/tdx: Fix race between set_memory_encrypted() and load_unaligned_zeropad()"
Next in thread: Dave Hansen: "Re: [PATCHv2 2/3] x86/tdx: Fix race between set_memory_encrypted() and load_unaligned_zeropad()"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 6/2/23 11:11, Michael Kelley (LINUX) wrote:

From: Tom Lendacky <thomas.lendacky@xxxxxxx> Sent: Thursday, June 1, 2023 11:19 AM

On 5/31/23 15:00, Michael Kelley (LINUX) wrote:

From: Sathyanarayanan Kuppuswamy

<sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>

Sent: Tuesday, May 30, 2023 6:22 AM

Hi,

On 5/30/23 5:57 AM, Tom Lendacky wrote:

On 5/29/23 19:57, Kirill A. Shutemov wrote:

On Fri, May 26, 2023 at 03:10:56PM -0700, Sathyanarayanan Kuppuswamy

wrote:

On 5/26/23 5:02 AM, Kirill A. Shutemov wrote:

Touching privately mapped GPA that is not properly converted to private
with MapGPA and accepted leads to unrecoverable exit to VMM.

load_unaligned_zeropad() can touch memory that is not owned by the
caller, but just happened to next after the owned memory.

/s/to/to be ?

Yep, my bad.

This load_unaligned_zeropad() behaviour makes it important when kernel
asks VMM to convert a GPA from shared to private or back. Kernel must
never have a page mapped into direct mapping (and aliases) as private
when the GPA is already converted to shared or when GPA is not yet
converted to private.

I am wondering whether this issue exist in the AMD code?

IMO, you can add some info on the window in set_memory_encrypted()
where this race exists.

I don't think AMD affected by load_unaligned_zeropad() the same way as
Intel does. But I'm not sure.

Tom, do you have any comments?

Right, shouldn't be an issue for SNP.

Thanks for confirming.

Tom -- For my education, could you elaborate on why this problem can't
occur in an SEV-SNP guest? There's still a window where the direct map
PTE and the RMP as maintained by the hypervisor are out-of-sync. If
load_unaligned_zeropad() does a read using the direct map PTE during
this out-of-sync window, isn't that going to trap to the hypervisor? How
is the scenario is handled from there to provide the zeros to
load_unaligned_zeropad()? I need to make sure Hyper-V is doing whatever
is needed. :-)

Ah, I think I misunderstood this when it was being talked about. The issue
SNP would have would be between setting the c-bit but before the PVALIDATE
is issued. Prior to the RMP being updated, referencing the page will
generate an #NPF and automatically change the RMP over to private (in
KVM). However, after the guest is resumed, the page will not have been
validated resulting in a #VC with error code 0x404 being generated,
causing the guest to terminate itself.

I suppose, when a 0x404 error code is encountered by the #VC handler, it
could call search_exception_tables() and call ex_handler_zeropad() for the
EX_TYPE_ZEROPAD type (ex_handler_zeropad is currently static, though).

Tom -- Does the above sequence *depend* on the hypervisor doing anything
to make it work? I'm not clear on why KVM would automatically change the
page over to private. If there's a dependency on the hypervisor doing
something, then it seems like we'll need to standardize that "something"
across hypervisors, lest we end up with per-hypervisor code in Linux to handle
this scenario. And running SEV-SNP with multiple VMPLs probably makes it
even more complicated.

No, it doesn't depend on the hypervisor doing anything. If the RMP hasn't been updated, then a #VMEXIT(NPF) will be triggered (see APM vol 2, Table 15-39). The hypervisor is free to do what it wants with that, e.g. just resume the guest (and immediately take another #VMEXIT(NPF), possibly). Since it is a different thread/vCPU trying to access the memory than is changing the state of the memory, eventually the guest kernel thread that is changing the state of the memory will issue the page state change to the hypervisor and the other thread can continue.

Thanks,
Tom

Kirill -- Same question about TDX. Does making load_unaligned_zeropad()
work in a TDX VM depend on the hypervisor doing anything? Or is the
behavior seen by the guest dependent only on architected behavior of
the TDX processor?

Looking at this problem from a slightly higher level, and thinking out loud
a bit, load_unaligned_zeropad() functionality is provided only for certain
architectures: x86/64, arm, arm64, and PowerPC 64 (little endian). There are
fallbacks for architectures that don't support it. With two minor tweaks to
Kconfig files, I've built x86 with load_unaligned_zeropad() disabled. Maybe
with today's processors the performance benefits are past their prime,
and running with it disabled in CoCo VMs is the better solution. Does
anyone have a sense of whether the perf impact would be measureable?

If doing the load_unaligned_zeropad() enable/disable at build time is too
limiting, maybe it could be runtime based on whether page private/shared
state is being enforced. I haven't looked at the details.

Thoughts?

Michael

Next message: Mark Brown: "Re: [PATCH] ASoC: fsl_sai: Enable BCI bit if SAI works on synchronous mode with BYP asserted"
Previous message: Linus Torvalds: "Re: Bug in short splice to socket?"
In reply to: Michael Kelley (LINUX): "RE: [PATCHv2 2/3] x86/tdx: Fix race between set_memory_encrypted() and load_unaligned_zeropad()"
Next in thread: Dave Hansen: "Re: [PATCHv2 2/3] x86/tdx: Fix race between set_memory_encrypted() and load_unaligned_zeropad()"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]