RE: [RFC PATCH v1 2/3] x86/vdso: Modify __vdso_sgx_enter_enclave() to allow parameter passing on untrusted stack

From: Xing, Cedric
Date: Wed Apr 24 2019 - 13:58:40 EST


Hi Andy,

I sent out my RFC patch v2 last night, that has your suggestions incorporated, plus a new unwind test to single step through the vDSO API to test out the CFI directives. Hopefully it is able to address all of your concerns. It's worth noting that, given the current patch fixes up #DB and #BP at ENCLU, the unwind test cannot run to completion. I assume Sean will revise the fixup code soon.

Thanks!

-Cedric

> -----Original Message-----
> From: Andy Lutomirski [mailto:luto@xxxxxxxxxx]
> Sent: Monday, April 22, 2019 6:26 PM
> To: Xing, Cedric <cedric.xing@xxxxxxxxx>
> Cc: LKML <linux-kernel@xxxxxxxxxxxxxxx>; X86 ML <x86@xxxxxxxxxx>; linux-
> sgx@xxxxxxxxxxxxxxx; Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; Hansen,
> Dave <dave.hansen@xxxxxxxxx>; Christopherson, Sean J
> <sean.j.christopherson@xxxxxxxxx>; nhorman@xxxxxxxxxx;
> npmccallum@xxxxxxxxxx; Ayoun, Serge <serge.ayoun@xxxxxxxxx>; Katz-zamir,
> Shay <shay.katz-zamir@xxxxxxxxx>; Huang, Haitao <haitao.huang@xxxxxxxxx>;
> Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>; Thomas Gleixner
> <tglx@xxxxxxxxxxxxx>; Svahn, Kai <kai.svahn@xxxxxxxxx>; Borislav Petkov
> <bp@xxxxxxxxx>; Josh Triplett <josh@xxxxxxxxxxxxxxxx>; Andrew Lutomirski
> <luto@xxxxxxxxxx>; Huang, Kai <kai.huang@xxxxxxxxx>; David Rientjes
> <rientjes@xxxxxxxxxx>; Jarkko Sakkinen <jarkko.sakkinen@xxxxxxxxxxxxxxx>
> Subject: Re: [RFC PATCH v1 2/3] x86/vdso: Modify
> __vdso_sgx_enter_enclave() to allow parameter passing on untrusted stack
>
> On Mon, Apr 22, 2019 at 5:37 PM Cedric Xing <cedric.xing@xxxxxxxxx>
> wrote:
> >
> > The previous __vdso_sgx_enter_enclave() requires enclaves to preserve
> > %rsp, which prohibits enclaves from allocating and passing parameters
> > for untrusted function calls (aka. o-calls).
> >
> > This patch addresses the problem above by introducing a new ABI that
> > preserves %rbp instead of %rsp. Then __vdso_sgx_enter_enclave() can
> > anchor its frame using %rbp so that enclaves are allowed to allocate
> > space on the untrusted stack by decrementing %rsp. Please note that
> > the stack space allocated in such way will be part of
> > __vdso_sgx_enter_enclave()'s frame so will be freed after
> > __vdso_sgx_enter_enclave() returns. Therefore,
> > __vdso_sgx_enter_enclave() has been changed to take a callback
> > function as an optional parameter, which if supplied, will be invoked
> > upon enclave exits (both AEX (Asynchronous Enclave
> > eXit) and normal exits), with the value of %rsp left off by the
> > enclave as a parameter to the callback.
> >
> > Here's the summary of API/ABI changes in this patch. More details
> > could be found in arch/x86/entry/vdso/vsgx_enter_enclave.S.
> > * 'struct sgx_enclave_exception' is renamed to 'struct
> sgx_enclave_exinfo'
> > because it is filled upon both AEX (i.e. exceptions) and normal
> enclave
> > exits.
> > * __vdso_sgx_enter_enclave() anchors its frame using %rbp (instead
> of %rsp in
> > the previous implementation).
> > * __vdso_sgx_enter_enclave() takes one more parameter - a callback
> function to
> > be invoked upon enclave exits. This callback is optional, and if not
> > supplied, will cause __vdso_sgx_enter_enclave() to return upon
> enclave exits
> > (same behavior as previous implementation).
> > * The callback function is given as a parameter the value of %rsp at
> enclave
> > exit to address data "pushed" by the enclave. A positive value
> returned by
> > the callback will be treated as an ENCLU leaf for re-entering the
> enclave,
> > while a zero or negative value will be passed through as the return
> > value of __vdso_sgx_enter_enclave() to its caller. It's also safe to
> > leave callback by longjmp() or by throwing a C++ exception.
> >
> > Signed-off-by: Cedric Xing <cedric.xing@xxxxxxxxx>
> > ---
> > arch/x86/entry/vdso/vsgx_enter_enclave.S | 156 ++++++++++++++--------
> -
> > arch/x86/include/uapi/asm/sgx.h | 14 +-
> > 2 files changed, 100 insertions(+), 70 deletions(-)
> >
> > diff --git a/arch/x86/entry/vdso/vsgx_enter_enclave.S
> > b/arch/x86/entry/vdso/vsgx_enter_enclave.S
> > index fe0bf6671d6d..210f4366374a 100644
> > --- a/arch/x86/entry/vdso/vsgx_enter_enclave.S
> > +++ b/arch/x86/entry/vdso/vsgx_enter_enclave.S
> > @@ -14,88 +14,118 @@
> > .code64
> > .section .text, "ax"
> >
> > -#ifdef SGX_KERNEL_DOC
> > /**
> > * __vdso_sgx_enter_enclave() - Enter an SGX enclave
> > *
> > * @leaf: **IN \%eax** - ENCLU leaf, must be EENTER or ERESUME
> > - * @tcs: **IN \%rbx** - TCS, must be non-NULL
> > - * @ex_info: **IN \%rcx** - Optional 'struct sgx_enclave_exception'
> pointer
> > + * @tcs: **IN 0x08(\%rsp)** - TCS, must be non-NULL
> > + * @ex_info: **IN 0x10(\%rsp)** - Optional 'struct
> sgx_enclave_exinfo'
> > + * pointer
> > + * @callback: **IN 0x18(\%rsp)** - Optional callback function to be
> called on
> > + * enclave exit or exception
> > *
> > * Return:
> > * **OUT \%eax** -
> > - * %0 on a clean entry/exit to/from the enclave, %-EINVAL if ENCLU
> > leaf is
> > - * not allowed or if TCS is NULL, %-EFAULT if ENCLU or the enclave
> > faults
> > + * %0 on a clean entry/exit to/from the enclave, %-EINVAL if ENCLU
> > + leaf is not
> > + * allowed, %-EFAULT if ENCLU or the enclave faults, or a
> > + non-positive value
> > + * returned from ``callback`` (if one is supplied).
> > *
> > * **Important!** __vdso_sgx_enter_enclave() is **NOT** compliant
> with the
> > - * x86-64 ABI, i.e. cannot be called from standard C code. As noted
> above,
> > - * input parameters must be passed via ``%eax``, ``%rbx`` and
> > ``%rcx``, with
> > - * the return value passed via ``%eax``. All registers except
> > ``%rsp`` must
> > - * be treated as volatile from the caller's perspective, including
> > but not
> > - * limited to GPRs, EFLAGS.DF, MXCSR, FCW, etc... Conversely, the
> > enclave
> > - * being run **must** preserve the untrusted ``%rsp`` and stack.
> > + * x86-64 ABI, i.e. cannot be called from standard C code. As noted
> > + above,
> > + * input parameters must be passed via ``%eax``, ``8(%rsp)``,
> > + ``0x10(%rsp)`` and
> > + * ``0x18(%rsp)``, with the return value passed via ``%eax``. All
> > + other registers
> > + * will be passed through to the enclave as is. All registers except
> > + ``%rbp``
> > + * must be treated as volatile from the caller's perspective,
> > + including but not
> > + * limited to GPRs, EFLAGS.DF, MXCSR, FCW, etc... Conversely, the
> > + enclave being
> > + * run **must** preserve the untrusted ``%rbp``.
> > + *
> > + * ``callback`` has the following signature:
> > + * int callback(long rdi, long rsi, long rdx,
> > + * struct sgx_enclave_exinfo *ex_info, long r8, long r9,
> > + * void *tcs, long ursp);
> > + * ``callback`` **shall** follow x86_64 ABI. All GPRs **except**
> > + ``%rax``, ``%rbx``
> > + * and ``rcx`` are passed through to ``callback``. ``%rdi``,
> > + ``%rsi``, ``%rdx``,
> > + * ``%r8``, ``%r9``, along with the value of ``%rsp`` when the
> > + enclave exited/excepted,
> > + * can be accessed directly as input parameters, while other GPRs can
> > + be
> > + * accessed in assembly if needed.
> > + * A positive value returned from ``callback`` will be treated as an
> > + ENCLU leaf
> > + * (e.g. EENTER/ERESUME) to reenter the enclave, while 0 or a
> > + negative return
>
> "to reenter the enclave without popping the extra data pushed by the
> enclave off the stack" or similar. We really don't want a situation
> where someone puts all there "keep on going" logic in the callback and
> the stack usage grows without bound.
>
> > + * value will be passed back to the caller of
> __vdso_sgx_enter_enclave().
> > + * It is also **safe** to ``longjmp()`` out of ``callback``.
>
> I'm not sure that "safe" needs emphasis.
>
> > */
> > -__vdso_sgx_enter_enclave(u32 leaf, void *tcs,
> > - struct sgx_enclave_exception *ex_info)
> > -{
> > - if (leaf != SGX_EENTER && leaf != SGX_ERESUME)
> > - return -EINVAL;
> > -
> > - if (!tcs)
> > - return -EINVAL;
> > -
> > - try {
> > - ENCLU[leaf];
> > - } catch (exception) {
> > - if (e)
> > - *e = exception;
> > - return -EFAULT;
> > - }
> > -
> > - return 0;
> > -}
> > -#endif
> > ENTRY(__vdso_sgx_enter_enclave)
> > - /* EENTER <= leaf <= ERESUME */
> > + /* Prolog */
> > + .cfi_startproc
> > + push %rbp
> > + .cfi_adjust_cfa_offset 8
> > + .cfi_rel_offset %rbp, 0
> > + mov %rsp, %rbp
> > + .cfi_def_cfa_register %rbp
> > +
> > +1: /* EENTER <= leaf <= ERESUME */
> > cmp $0x2, %eax
> > - jb bad_input
> > -
> > + jb 6f
> > cmp $0x3, %eax
> > - ja bad_input
> > + ja 6f
> >
> > - /* TCS must be non-NULL */
> > - test %rbx, %rbx
> > - je bad_input
> > + /* Load TCS and AEP */
> > + mov 0x10(%rbp), %rbx
> > + lea 2f(%rip), %rcx
> >
> > - /* Save @exception_info */
> > - push %rcx
> > -
> > - /* Load AEP for ENCLU */
> > - lea 1f(%rip), %rcx
> > -1: enclu
> > -
> > - add $0x8, %rsp
> > - xor %eax, %eax
> > - ret
> > -
> > -bad_input:
> > - mov $(-EINVAL), %rax
> > - ret
> > -
> > -.pushsection .fixup, "ax"
> > - /* Re-load @exception_info and fill it (if it's non-NULL) */
> > -2: pop %rcx
> > - test %rcx, %rcx
> > - je 3f
> > + /* Single ENCLU serving as both EENTER and AEP (ERESUME) */
> > +2: enclu
> >
> > + /* EEXIT path */
> > + xor %ebx, %ebx
> > +3: mov 0x18(%rbp), %rcx
> > + jrcxz 4f
> > mov %eax, EX_LEAF(%rcx)
> > - mov %di, EX_TRAPNR(%rcx)
> > - mov %si, EX_ERROR_CODE(%rcx)
> > + jnc 4f
> > + mov %di, EX_TRAPNR(%rcx)
> > + mov %si, EX_ERROR_CODE(%rcx)
> > mov %rdx, EX_ADDRESS(%rcx)
> > -3: mov $(-EFAULT), %rax
> > +
> > +4: /* Call *callback if supplied */
> > + mov 0x20(%rbp), %rax
> > + test %rax, %rax
>
> Maybe have a comment like "At this point, the effective return value is
> in RBX. If there is no callback, then return it."
>
> > + cmovz %rbx, %rax
> > + jz 7f
> > + /* Align stack and clear RFLAGS.DF per x86_64 ABI */
> > + mov %rsp, %rbx
>
> Whoa, this is too subtle here. Can you update the comment to clarify
> that the uRSP value set by the enclave needs to be saved so that the
> enclave can be resumed if needed?
>
> > + and $-0x10, %rsp
> > + cld
> > + /* Parameters for *callback */
> > + push %rbx
> > + push 0x10(%rbp)
> > + /* Call via retpoline */
> > + call 40f
> > + /* Cleanup stack */
> > + mov %rbx, %rsp
>
> To me, "Cleanup stack" makes me think that you're restoring the original
> RSP, but you're actually just undoing in the stack alignment.
> How about "Undo stack alignment"?
>
> But I'm not seeing the code that causes a return value RAX <= 0 to just
> return.
>
> > + jmp 1b
> > +40: /* retpoline */
> > + call 42f
> > +41: pause
> > + lfence
> > + jmp 41b
> > +42: mov %rax, (%rsp)
> > + ret
> > +
> > +5: /* Exception path */
> > + mov $-EFAULT, %ebx
> > + stc
> > + jmp 3b
> > +
> > +6: /* Unsupported ENCLU leaf */
> > + cmp $0, %eax
> > + jle 7f
> > + mov $-EINVAL, %eax
> > +
> > +7: /* Epilog */
> > + leave
> > + .cfi_def_cfa %rsp, 8
> > ret
> > -.popsection
> > + .cfi_endproc
> >
> > -_ASM_VDSO_EXTABLE_HANDLE(1b, 2b)
> > +_ASM_VDSO_EXTABLE_HANDLE(2b, 5b)
>
> --Andy