Re: [PATCH] [v2] x86/sgx: Allow enclaves to use Asynchrounous Exit Notification

From: Haitao Huang
Date: Thu Jul 28 2022 - 13:54:53 EST


On Tue, 26 Jul 2022 16:21:53 -0500, Kai Huang <kai.huang@xxxxxxxxx> wrote:

On Tue, 2022-07-26 at 10:28 -0500, Haitao Huang wrote:
On Tue, 26 Jul 2022 05:47:14 -0500, Kai Huang <kai.huang@xxxxxxxxx> wrote:

> On Tue, 2022-07-26 at 00:10 -0500, Haitao Huang wrote:
> > On Mon, 25 Jul 2022 05:36:17 -0500, Kai Huang <kai.huang@xxxxxxxxx>
> > wrote:
> >
> > > On Fri, 2022-07-22 at 08:21 -0700, Dave Hansen wrote:
> > > > On 7/22/22 06:26, Kai Huang wrote:
> > > > > Did a quick look at the spec. It appears ENCLU[EDECCSSA] should be
> > > > > used together with AEX-notify. So besides advertising the new
> > > > > SGX_ATTR_ASYNC_EXIT_NOTIFY bit to the KVM guest, I think we should
> > > > > also advertise the ENCLU[EDECCSSA] support in guest's CPUID, like
> > > > > below (untested)?
> > > >
> > > > Sounds like a great follow-on patch! It doesn't seem truly
> > > > functionally required from the spec:
> > > >
> > > > > EDECCSSA is a new Intel SGX user leaf function
> > > > > (ENCLU[EDECCSSA]) that can facilitate AEX notification handling...
> > > >
> > > > If that's wrong or imprecise, I'd love to hear more about it and
> > > > also about how the spec will be updated.
> > > >
> > >
> > > They are enumerated separately, but it looks like in practice the
> > > notify handler will use it to switch back to the correct/targeted
> > > CSSA to continue to run normally after handling the exit notify.
> > > This is my understanding of what "facilitate" means in the spec.
> > >
> > > Btw, in real hardware I think the two should come together, meaning
> > > no real hardware will only support one.
> > >
> > > Haitao, could you give us more information?
> > >
> > You are right. They are enumerated separately and HW should come with
> > both or neither.
> > My understanding is that it is also possible for enclaves to choose not
> > to receive AEX notify but still use EDECCSSA.
> >
>
> What is the use case of using EDECCSSA w/o using AEX notify?
> If I understand correctly, EDECCSSA effectively switches to another
> thread (using the previous SSA, which is the context of another TCS
> thread if I understand correctly). Won't this cause a problem?

No. Decrementing CSSA is equivalent to popping stack frames, not switching
threads.
In some cases such as so-called "first stage" exception handling, one
could pop CSSA back to the previous after resetting CPU context and stack
frame appropriate to the "second stage" or "real" exception handling
routine, then jump to the handler directly. This could improve exception
handling performance by saving an EEXIT/ERESUME trip.



Looking at the AEX-notify spec again, EDECCSSA does below:

(* At this point, the instruction is guaranteed to complete *)
CR_TCS_PA.CSSA := CR_TCS_PA.CSSA - 1;
CR_GPR_PA := Physical_Address(DS:TMP_GPR);

It doesn't reset the RIP to CR_GPR_PA.RIP, so it looks like you are right. It is
only "popping the stack frame" and doesn't switch threads.

But the pseudo code of EDECCSSA only updates the CR_TCS_PA and CR_GPR_PA
registers (forget about XSAVE for now), and doesn't manually update the actual
CPU registers such as GPRs. Are the actual CPU registers updated automatically
when CR_xx are updated?
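
To make the question concrete, here is a toy Python model of the two
assignments in the pseudocode quoted above. It is only a sketch: Thread,
gpr_frame and edeccssa are hypothetical names, the "live" dict stands in
for the real CPU registers, and raising an error when CSSA is already 0 is
this sketch's assumption about how underflow is rejected. In this model,
decrementing CSSA changes which SSA frame is current and nothing else.

```python
# Toy model of the quoted EDECCSSA pseudocode; not real SGX code.
# Every name below is hypothetical and exists only for illustration.
from dataclasses import dataclass, field


@dataclass
class Thread:
    cssa: int                                   # stands in for CR_TCS_PA.CSSA
    ssa: list                                   # SSA frames holding saved GPRs
    live: dict = field(default_factory=dict)    # live CPU registers

    @property
    def gpr_frame(self):
        # Stands in for CR_GPR_PA: the frame selected by the current CSSA.
        return self.ssa[self.cssa]


def edeccssa(t: Thread) -> None:
    # Assumption of this sketch: underflow is rejected with an error.
    if t.cssa == 0:
        raise RuntimeError("CSSA is already 0")
    t.cssa -= 1   # the only state change modelled; live registers untouched


t = Thread(cssa=1, ssa=[{"rip": 0x1000}, {}], live={"rip": 0x2000})
edeccssa(t)
```

After the call, t.cssa is 0 and t.gpr_frame is SSA[0] again, while
t.live["rip"] still holds the value it had before.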

No, the enclave code is supposed to do that. Here are a few more details on
the flow I mentioned.

On any AEX event, the CPU saves state, including GPR/XSAVE, into SSA[0].
When AEX-notify is turned off, for an enclave to handle an exception that
occurred inside the enclave, user space must do EENTER with the same TCS on
which the exception occurred. EENTER gives a clean slate of GPRs, and
SSA[1] becomes active for the next AEX. It is the enclave's responsibility
to save the GPR/XSAVE state in SSA[0] to some place (e.g., the stack), then
EDECCSSA, then jump to the "second stage" handler. (Note that SSA[0] is now
reactivated and ready if another AEX occurs.) The second stage handler then
fixes the situation that caused the original AEX, restores the CPU context
from the saved SSA[0] state, and jumps back to the original place where the
exception happened.
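
The sequence above can be walked through with a small Python simulation.
This is a sketch under stated assumptions, not enclave code: EnclaveThread
and its methods are hypothetical names, registers are plain dicts, and a
real enclave would do the first stage in assembly. It only tracks CSSA, the
SSA frames, the live registers and the enclave stack.

```python
# Toy walk-through of the two-stage flow with AEX-notify turned off.
# All names are hypothetical; this only tracks bookkeeping, not real state.

class EnclaveThread:
    def __init__(self, nssa=2):
        self.cssa = 0
        self.ssa = [None] * nssa   # SSA frames
        self.regs = {}             # live CPU registers
        self.stack = []            # enclave stack

    def aex(self):
        # The CPU saves the interrupted context into SSA[CSSA], then
        # CSSA advances so the next frame is the active one.
        self.ssa[self.cssa] = dict(self.regs)
        self.cssa += 1
        self.regs = {}

    def eenter(self):
        # Re-entry on the same TCS starts from a clean register state.
        self.regs = {"rip": "first_stage"}

    def first_stage(self):
        # Copy the saved context out of the previous SSA frame, pop the
        # frame (the EDECCSSA step), then jump to the second stage.
        self.stack.append(dict(self.ssa[self.cssa - 1]))
        self.cssa -= 1
        self.regs["rip"] = "second_stage"

    def second_stage(self, fix):
        # Fix the cause of the AEX, then restore the saved context so
        # execution continues where the exception happened.
        ctx = self.stack.pop()
        fix(ctx)
        self.regs = ctx


t = EnclaveThread()
t.regs = {"rip": "faulting_insn", "rax": 7}
t.aex()                                         # exception inside the enclave
t.eenter()                                      # user space re-enters, same TCS
t.first_stage()
t.second_stage(lambda ctx: ctx.update(rax=8))   # stand-in for the real fix
```

In this model CSSA is back to 0 as soon as first_stage() finishes, so a
further AEX during the second stage would land in SSA[0] again, and no
EEXIT/ERESUME round trip is needed to get back to the interrupted code.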