Re: [PATCH 24/25] x86/sgx: Free up EPC pages directly to support large page ranges

From: Reinette Chatre
Date: Mon Dec 06 2021 - 17:08:12 EST


Hi Jarkko,

On 12/4/2021 3:47 PM, Jarkko Sakkinen wrote:
> On Wed, Dec 01, 2021 at 11:23:22AM -0800, Reinette Chatre wrote:
>> The page reclaimer ensures availability of EPC pages across all
>> enclaves. In support of this it runs independently from the individual
>> enclaves in order to take locks from the different enclaves as it writes
>> pages to swap.
>>
>> When needing to load a page from swap an EPC page needs to be available for
>> its contents to be loaded into. Loading an existing enclave page from swap
>> does not reclaim EPC pages directly if none are available, instead the
>> reclaimer is woken when the available EPC pages are found to be below a
>> watermark.
>>
>> When iterating over a large number of pages in an oversubscribed
>> environment there is a race between the reclaimer being woken up and EPC
>> pages being reclaimed fast enough for the page operations to proceed.
>>
>> Instead of tuning the race between the page operations and the reclaimer,
>> the page operations make sure that there are EPC pages available.
>>
>> Signed-off-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
>
> Why does this need to be part of this patch set?

When pages are modified they are required to be in the EPC and thus potentially need to be loaded from swap. When needing to modify a large number of pages in an oversubscribed environment there is a problem with the reclaimer providing free EPC pages fast enough for all the page modification operations to proceed.

What that means is that if a user attempts to modify a large range of pages in an oversubscribed environment, the operation is likely to fail to complete and instead result in partial success, covering only as many pages as were on the free list. This is because the reclaimer may not run fast enough to free up sufficient EPC pages dynamically.

This becomes complicated for user space. It could increase the priority of the reclaimer but that has been found to be insufficient*. There would still not be a guarantee that after one page modification call fails enough pages would have been freed up in support of a second page modification call.

With this change it is ensured that, while pages are being modified, sufficient EPC pages are available to support the modifications.
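
To illustrate the difference, here is a toy user-space model. It is only a sketch and not the kernel code from this patch: every name in it (epc_model, model_reclaim, model_modify_range, MODEL_RECLAIM_BATCH) is made up for illustration, and it assumes the worst case where the background reclaimer frees nothing while the range is being processed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical batch size for one synchronous reclaim pass. */
#define MODEL_RECLAIM_BATCH 16

/* Toy state: not a kernel structure. */
struct epc_model {
	size_t free_pages;   /* EPC pages currently on the free list */
	size_t reclaimable;  /* in-use EPC pages that could be written to swap */
};

/* Synchronously move up to 'batch' in-use pages to the free list. */
static size_t model_reclaim(struct epc_model *m, size_t batch)
{
	size_t n;

	assert(m != NULL);
	n = batch < m->reclaimable ? batch : m->reclaimable;
	m->reclaimable -= n;
	m->free_pages += n;
	return n;
}

/*
 * Modify 'nr' swapped-out pages. Each page needs one free EPC page to be
 * loaded into and becomes reclaimable again once loaded.
 *
 * With direct_reclaim set, the operation itself frees pages whenever the
 * free list drops below 'low_mark'. Without it, the operation relies on a
 * background reclaimer that, in this worst-case model, never catches up.
 *
 * Returns the number of pages that were modified.
 */
static size_t model_modify_range(struct epc_model *m, size_t nr,
				 size_t low_mark, int direct_reclaim)
{
	size_t done;

	for (done = 0; done < nr; done++) {
		if (direct_reclaim && m->free_pages < low_mark)
			model_reclaim(m, MODEL_RECLAIM_BATCH);

		if (m->free_pages == 0)
			break;  /* no EPC page available: partial success */

		m->free_pages--;   /* page loaded from swap */
		m->reclaimable++;  /* and it can be reclaimed later */
	}

	return done;
}
```

In this model, starting with 4 free pages and asking for 50 pages to be modified stops after 4 pages when direct_reclaim is 0, which is the partial success described above. With direct_reclaim set to 1 and a low mark of 2 the same request completes all 50 pages.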

Reinette

* The test that follows this patch was used to explore this scenario.