Re: [PATCH V2 16/32] x86/sgx: Support restricting of enclave page permissions

From: Reinette Chatre
Date: Wed Feb 23 2022 - 17:42:34 EST


Hi Vijay,

On 2/23/2022 11:21 AM, Dhanraj, Vijay wrote:
> Hi All,
>
> Regarding the recent update of splitting the page permissions change request
> into two IOCTLS (RELAX and RESTRICT), can we combine them into one? That is,
> revert to how it was done in the v1 version?

While V1 did have a single ioctl() to handle both relaxing and restricting
permissions, it was never possible for the kernel to distinguish what the
user intended. For this reason, even though there was a single ioctl() in V1,
it implemented permission restriction while supporting permission
relaxing as a side effect since the PTEs are flushed and new PTEs will
support the new permission. A consequence was that the V1 SGX_IOC_PAGE_MODP
required ENCLU[EACCEPT] from within the enclave even if it was only intended
to be used to relax permissions. SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS in
V2 is exactly the same as SGX_IOC_PAGE_MODP of V1.

>
> Why? Currently in Gramine (a library OS for unmodified applications,
> https://gramineproject.io/) with the new proposed change, one needs
> to store the page permission for each page or range of pages. And for
> every request of `mmap` or `mprotect`, Gramine would have to do a lookup
> of the page permissions for the request range and then call the respective
> IOCTL either RESTRICT or RELAX. This seems a little overwhelming.

Gramine would also need to know when to enter the enclave to run EMODPE, which
goes hand in hand with running SGX_IOC_ENCLAVE_RELAX_PERMISSIONS.
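
To illustrate the bookkeeping being discussed, here is a minimal sketch of how
a runtime could compare the current and requested permissions of a page to
pick a flow. The permission bits, the enum and classify_perm_change() are
hypothetical names made up for this example; they are not part of the patch
series or of Gramine.

```c
#include <stdint.h>

/* Hypothetical permission bits for illustration; not the SGX uapi values. */
#define PERM_R 0x1u
#define PERM_W 0x2u
#define PERM_X 0x4u

enum perm_change {
	PERM_UNCHANGED,	/* requested permissions equal current ones */
	PERM_RELAX,	/* only adds bits: EMODPE, then the RELAX ioctl() */
	PERM_RESTRICT,	/* only removes bits: RESTRICT ioctl(), then EACCEPT */
	PERM_MIXED,	/* adds and removes bits: both flows are needed */
};

/* Decide which flow a change from 'cur' to 'req' would need. */
static enum perm_change classify_perm_change(uint32_t cur, uint32_t req)
{
	uint32_t added = req & ~cur;	/* bits set in req but not in cur */
	uint32_t removed = cur & ~req;	/* bits set in cur but not in req */

	if (!added && !removed)
		return PERM_UNCHANGED;
	if (added && removed)
		return PERM_MIXED;
	return added ? PERM_RELAX : PERM_RESTRICT;
}
```

With this sketch a change from R to RW is classified as PERM_RELAX and a
change from RW to R as PERM_RESTRICT, which is the per-page lookup the
question above refers to.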

>
> Request: Instead, can we do `MODPE`, call `RESTRICT` IOCTL, and then do an
> `EACCEPT` irrespective of RELAX or RESTRICT page permission request? With this
> approach, we can avoid storing page permissions and simplify the implementation.

This should be possible with the current implementation, similar to the
previous implementation, but it is not optimal if only EMODPE followed by
SGX_IOC_ENCLAVE_RELAX_PERMISSIONS is what is needed.

>
> I understand RESTRICT IOCTL would do a `MODPR` and trigger `ETRACK` flows to do
> TLB shootdowns which might not be needed for RELAX IOCTL but I am not sure what
> will be the performance impact. Is there any data point to see the performance impact?

It can be worse than just that. EMODPR requires the EPC page to be present
and thus the page would need to be loaded from swap and decrypted if it
is not present. This may also mean that existing EPC pages need to be
swapped out (first blocked, then encrypted to backing storage, then the
ETRACK flow followed by IPIs to ensure there are no more references to that
page) ... before there is space available for the needed page to be loaded
and decrypted.

That only takes care of the EMODPR ... which as you state needs
to be followed by the ETRACK flow and IPIs.

The above is also just for the OS portion - after that there is the
EACCEPT that needs to be run from within the enclave for every page whether
permissions were relaxed or restricted. This would be dependent on the
implementation - whether the enclave is entered once per EACCEPT or once
for all EACCEPTs.
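
The difference between the two approaches can be sketched by counting enclave
entries. This is only a toy model: enter_enclave() and eaccept_page() are
hypothetical stand-ins that bump counters; they do not execute EENTER or
ENCLU[EACCEPT].

```c
#include <stddef.h>

struct eaccept_stats {
	size_t enclave_entries;	/* how often the enclave was entered */
	size_t eaccepts;	/* how many pages were accepted */
};

/* Hypothetical stand-in for entering the enclave. */
static void enter_enclave(struct eaccept_stats *s)
{
	s->enclave_entries++;
}

/* Hypothetical stand-in for running EACCEPT on one page. */
static void eaccept_page(struct eaccept_stats *s, size_t page)
{
	(void)page;
	s->eaccepts++;
}

/* One enclave entry per EACCEPT. */
static void accept_entry_per_page(struct eaccept_stats *s, size_t nr_pages)
{
	for (size_t i = 0; i < nr_pages; i++) {
		enter_enclave(s);
		eaccept_page(s, i);
	}
}

/* One enclave entry for all EACCEPTs. */
static void accept_single_entry(struct eaccept_stats *s, size_t nr_pages)
{
	enter_enclave(s);
	for (size_t i = 0; i < nr_pages; i++)
		eaccept_page(s, i);
}
```

Both variants accept the same number of pages; only the number of enclave
transitions differs (nr_pages versus one).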

All of the above would be unnecessary if permissions were just relaxed from
within the enclave while SGX_IOC_ENCLAVE_RELAX_PERMISSIONS is used to
perform the OS actions.

The performance impact should be easy to determine: run both ioctl()s
and compare how long they take. Since you are asking about Gramine this may be
best to do in that environment but I can attempt something on your behalf by
using the existing SGX selftest infrastructure.

As an experiment I modified the existing "unclobbered_vdso_oversubscribed_remove"
test case that currently runs the SGX_IOC_ENCLAVE_MODIFY_TYPE on a large memory
region to instead run ioctl()s SGX_IOC_ENCLAVE_RELAX_PERMISSIONS and
SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS. In my test I ran these ioctl()s on a 4GB
memory range to amplify any performance impact since I was just measuring it
by printing timestamps from user space.
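
A minimal sketch of this kind of user-space timing is shown below, assuming a
POSIX system with clock_gettime(). now_ns(), time_call() and
stand_in_workload() are hypothetical helpers for illustration and are not the
selftest code; in a real measurement the callback would issue the ioctl()
under test.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <stdint.h>
#include <time.h>

/* Current monotonic time in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Run fn(arg) once and return how long it took in nanoseconds. */
static uint64_t time_call(void (*fn)(void *), void *arg)
{
	uint64_t start = now_ns();

	fn(arg);
	return now_ns() - start;
}

/* Hypothetical workload standing in for the ioctl() being measured. */
static void stand_in_workload(void *arg)
{
	volatile unsigned long *counter = arg;

	for (int i = 0; i < 1000; i++)
		(*counter)++;
}
```

Timing each ioctl() with the same helper over the same range keeps the two
numbers comparable, although coarse timestamps like these are only meaningful
when the measured operation is long, hence the large range.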

My results showed that:
* Running SGX_IOC_ENCLAVE_RELAX_PERMISSIONS on the 4GB region took less than
  a second. No EACCEPT is needed from user space.

* Running SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS on the 4GB region took about
  20 seconds.
* Running EACCEPT on each enclave page took an additional 20 seconds. (Please
  note that this uses a suboptimal way of entering the enclave for each
  EACCEPT; it would be more efficient to enter the enclave once and run
  EACCEPT for each page without exiting the enclave.)
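
To put the numbers above in per-page terms, here is a rough back-of-the-envelope
sketch. It assumes "4GB" means 4 GiB and uses the 4 KiB enclave page size;
EPC_PAGE_SIZE, pages_in_range() and per_page_usec() are illustrative names,
and the 20 second input is the approximate figure quoted above, not a precise
measurement.

```c
#include <stdint.h>

/* Enclave pages are 4 KiB; the macro name is made up for this example. */
#define EPC_PAGE_SIZE 4096ull

/* Number of enclave pages in a range of 'bytes' bytes. */
static uint64_t pages_in_range(uint64_t bytes)
{
	return bytes / EPC_PAGE_SIZE;
}

/* Average cost per page in microseconds for a total runtime in seconds. */
static double per_page_usec(double total_seconds, uint64_t pages)
{
	return total_seconds * 1e6 / (double)pages;
}
```

Under these assumptions the region holds 1048576 pages, so about 20 seconds
works out to roughly 19 microseconds per page for the restrict ioctl(), and
about the same again for the EACCEPT pass as measured.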

The performance impact seems significant to me.

Reinette