Re: [PATCH 1/4] x86/sgx: Track phase and type of SGX EPC pages

From: Sean Christopherson
Date: Thu Jul 15 2021 - 11:33:47 EST


On Wed, Jul 14, 2021, Luck, Tony wrote:
> > I've no objection to tracking the type for SGX2, my argument in the context of
> > #MC support is that there should be no need to track the type. Either the #MC
> > is recoverable or it isn't, and the enclave is toast regardless of what type of
> > page hit the #MC.
>
> I'll separate the "phase" from the "type".
>
> Here phase is used for the life-cycle of EPC pages:
>
> DIRTY -> FREE -> IN-USE -> DIRTY

Not that it affects anything, but that's not quite true. In hardware, pages are
either FREE or IN-USE; there is no concept of DIRTY. DIRTY is the kernel's
arbitrary description of a page that has not been sanitized and so is considered
to be in an unknown state, i.e. the kernel doesn't know if it's FREE or IN-USE.

Once a page is sanitized (during boot), its state is known and the page is never
put back on the so-called dirty list, i.e. the software flow is:

DIRTY -> FREE -> IN-USE -> FREE
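
As a rough illustration of that flow, here is a minimal sketch of the allowed
transitions. The enum and helper names are made up for this example and are
not the kernel's actual SGX definitions; the only thing it encodes is that
IN-USE goes back to FREE and nothing ever returns to DIRTY.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; not taken from arch/x86/kernel/cpu/sgx. */
enum epc_phase {
	EPC_DIRTY,	/* not yet sanitized, hardware state unknown to the kernel */
	EPC_FREE,	/* sanitized or released, available for allocation */
	EPC_IN_USE,	/* allocated to an enclave */
};

/* Return true if the software flow permits moving from 'from' to 'to'. */
static bool epc_phase_transition_ok(enum epc_phase from, enum epc_phase to)
{
	switch (from) {
	case EPC_DIRTY:
		return to == EPC_FREE;		/* sanitized during boot */
	case EPC_FREE:
		return to == EPC_IN_USE;	/* handed out by the allocator */
	case EPC_IN_USE:
		return to == EPC_FREE;		/* released, never back to DIRTY */
	}
	return false;
}

/* Update the phase, asserting that the transition is one of the three above. */
static void epc_set_phase(enum epc_phase *phase, enum epc_phase next)
{
	assert(epc_phase_transition_ok(*phase, next));
	*phase = next;
}
```

With this, a page that starts as EPC_DIRTY can be moved to EPC_FREE, then to
EPC_IN_USE, then back to EPC_FREE, and any attempt to move it to EPC_DIRTY
trips the assertion.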

> Errors can be reported by memory controller page scrubbers for pages that are
> not "IN-USE" ... and the recovery action is just to make sure that they are
> never allocated.
>
> When a page is IN-USE ... it has a "type". I currently only have a way to
> inject errors into SGX_PAGE_TYPE_REG pages. That means initial recovery code
> is going to focus on those since that is all I can test. But I'll try not to
> special case them as far as possible.

Inability to test expected behavior doesn't mean we shouldn't implement towards
the expected behavior, i.e. someone somewhere must know how SECS and VA pages
behave in response to a memory error.