Re: [PATCHv3 0/3] x86/tdx: Fix one more load_unaligned_zeropad() issue

From: Kirill A. Shutemov
Date: Mon Jul 24 2023 - 19:19:38 EST


On Thu, Jul 13, 2023 at 02:43:39PM +0000, Michael Kelley (LINUX) wrote:
> From: Kirill A. Shutemov <kirill@xxxxxxxxxxxxx> Sent: Saturday, July 8, 2023 11:09 PM
> >
> > On Sat, Jul 08, 2023 at 11:53:08PM +0000, Michael Kelley (LINUX) wrote:
> > > From: Kirill A. Shutemov <kirill@xxxxxxxxxxxxx> Sent: Friday, July 7, 2023 7:07 AM
> > > >
> > > > On Thu, Jul 06, 2023 at 04:48:32PM +0000, Michael Kelley (LINUX) wrote:
> > > > > From: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx> Sent: Tuesday, June 6,
> > 2023 2:56 AM
> > >
> > > [snip]
> > >
> > > >
> > > > It only addresses the problem that happens on transition, but
> > > > load_unaligned_zeropad() is still a problem for the shared mappings in
> > > > general, after transition is complete. Like if load_unaligned_zeropad()
> > > > steps from private to shared mapping and shared mapping triggers #VE,
> > > > kernel should be able to handle it.
> > >
> > > I'm showing my ignorance of TDX architectural details, but what's the
> > > situation where shared mappings in general can trigger a #VE? How
> > > do such situations get handled for references that aren't from
> > > load_unaligned_zeropad()?
> > >
> >
> > Shared mappings are under host/VMM control. It can just not map the page
> > in shared-ept and trigger ept-violation #VE.
>
> I know you are out on vacation, but let me follow up now for further
> discussion when you are back.
>
> Isn't the scenario you are describing a malfunctioning or malicious
> host/VMM? Would what you are describing be done as part of normal
> operation? Kernel code must have switched the page from private to
> shared for some purpose. As soon as that code (which presumably
> does not have any entry in the exception table) touches the page, it
> would take the #VE and the enter the die path because there's no fixup.
> So is there value in having load_unaligned_zeropad() handle the #VE and
> succeed where a normal reference would fail?

#VE on shared memory is legitimately used for MMIO. But MMIO region is
usually separate from the real memory in physical address space.

But we also have DMA.

DMA pages allocated from common pool of memory and they can be next to
dentry cache that kernel accesses with load_unaligned_zeropad(). DMA pages
are shared, but they usually backed by memory and not cause #VE. However
shared memory is under full control from VMM and VMM can remove page at
any point which would make platform to deliver #VE to the guest. This is
pathological scenario, but I think it still worth handling gracefully.

> I'd still like to see the private <-> shared transition code mark the pages
> as invalid during the transition, and avoid the possibility of #VE and
> similar cases with SEV-SNP. Such approach reduces (eliminates?)
> entanglement between CoCo-specific exceptions and
> load_unaligned_zeropad(). It also greatly simplifies TD Partition cases
> and SEV-SNP cases where a paravisor is used.

I doesn't eliminates issue for TDX as the scenario above is not transient.
It can happen after memory is converted to shared.

--
Kiryl Shutsemau / Kirill A. Shutemov