RE: [PATCHv3 0/3] x86/tdx: Fix one more load_unaligned_zeropad() issue

From: Michael Kelley (LINUX)
Date: Thu Jul 06 2023 - 12:48:40 EST


From: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx> Sent: Tuesday, June 6, 2023 2:56 AM
>
> During review of TDX guests on Hyper-V patchset Dave pointed to the
> potential race between changing page private/shared status and
> load_unaligned_zeropad().
>
> Fix the issue.
>
> v3:
> - Fix grammar;
> - Add Sathya's Reviewed-bys;
> v2:
> - Add more info in commit message of the first patch.
> - Move enc_status_change_finish_noop() into a separate patch.
> - Fix typo in commit message and comment.
>
> Kirill A. Shutemov (3):
> x86/mm: Allow guest.enc_status_change_prepare() to fail
> x86/tdx: Fix race between set_memory_encrypted() and
> load_unaligned_zeropad()
> x86/mm: Fix enc_status_change_finish_noop()
>
> arch/x86/coco/tdx/tdx.c | 64 +++++++++++++++++++++++++++++++--
> arch/x86/include/asm/x86_init.h | 2 +-
> arch/x86/kernel/x86_init.c | 4 +--
> arch/x86/mm/mem_encrypt_amd.c | 4 ++-
> arch/x86/mm/pat/set_memory.c | 3 +-
> 5 files changed, 69 insertions(+), 8 deletions(-)
>
> --
> 2.39.3

These fixes notwithstanding, load_unaligned_zeropad() is not handled
properly in a TDX VM. The problem was introduced with commit
c4e34dd99f2e, which moved the fixup code to function
ex_handler_zeropad(). This new function does a verification against
fault_addr, and the verification always fails because fault_addr is zero.
The call sequence is:

exc_virtualization_exception()
ve_raise_fault()
gp_try_fixup_and_notify() <-- always passes 0 as fault_addr
fixup_exception()
ex_handler_zeropad()

The validation of fault_addr could probably be removed since
such validation wasn't there prior to c4e34dd99f2e. But before
going down that path, I want to propose a different top-level
solution to the interaction between load_unaligned_zeropad()
and CoCo VM page transitions between private and shared.

When a page is transitioning, the caller can and should ensure
that it is not being accessed during the transition. But we have
the load_unaligned_zeropad() wildcard. So do the following for
the transition sequence in __set_memory_enc_pgtable():

1. Remove aliasing mappings
2. Remove the PRESENT bit from the PTEs of all transitioning pages
3. Flush the TLB globally
4. Flush the data cache if needed
5. Set/clear the encryption attribute as appropriate
6. Notify the hypervisor of the page status change
7. Add back the PRESENT bit

With this approach, load_unaligned_zeropad() just takes the
normal page-fault-based fixup path if it touches a page that is
transitioning. As a result, load_unaligned_zeropad() and CoCo
VM page transitioning are completely decoupled. We don't
need to handle architecture-specific CoCo VM exceptions and
fix things up.

I've posted an RFC PATCH that implements this approach [1],
and tested on TDX VMs and SEV-SNP VMs in vTOM mode.
The RFC PATCH has more details on the benefits and
implications. Follow-up discussion should probably be done
on that email thread.

Michael

[1] https://lore.kernel.org/lkml/1688661719-60329-1-git-send-email-mikelley@xxxxxxxxxxxxx/T/#u