Re: [PATCH 4/4] selftests/sgx: Trigger the reclaimer and #PF handler

From: Reinette Chatre
Date: Wed Jul 07 2021 - 17:20:09 EST


Hi Jarkko,

On 7/7/2021 1:50 PM, Jarkko Sakkinen wrote:
On Wed, Jul 07, 2021 at 08:02:42AM -0700, Reinette Chatre wrote:
Hi Jarkko,

On 7/7/2021 2:17 AM, Jarkko Sakkinen wrote:
On Tue, Jul 06, 2021 at 05:10:38PM -0700, Reinette Chatre wrote:
Hi Jarkko,

On 7/6/2021 4:50 PM, Jarkko Sakkinen wrote:
On Tue, Jul 06, 2021 at 11:34:54AM -0700, Reinette Chatre wrote:
Hi Jarkko,

On 7/5/2021 7:36 AM, Jarkko Sakkinen wrote:
Create a heap for the test enclave, which has the same size as all
available Enclave Page Cache (EPC) pages in the system. This guarantees
that all test_encl.elf pages *and* the SGX Enclave Control Structure (SECS)
have been swapped out by the page reclaimer at load time. Actually,
this adds a bit more stress than that, since part of the EPC gets reserved
for the Version Array (VA) pages.
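
For illustration only, here is a minimal sketch of how the total EPC size could be derived so that a heap can be sized to match it. It is not taken from the patch. It assumes the EPC sections are enumerated through CPUID leaf 0x12, sub-leaves 2 and up, and that the register values have already been read into an array; struct cpuid_regs, epc_section_size() and total_epc_size() are hypothetical names.

```c
#include <stdint.h>

/*
 * CPUID leaf 0x12, sub-leaves 2 and up, describe EPC sections.
 * EAX[3:0] is the sub-leaf type; 1 means a valid EPC section.
 */
#define EPC_TYPE_MASK     0xfu
#define EPC_TYPE_SECTION  0x1u

/* Register values returned for one sub-leaf (hypothetical helper type). */
struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/* Section size: ECX[31:12] holds bits 31:12, EDX[19:0] holds bits 51:32. */
static uint64_t epc_section_size(const struct cpuid_regs *r)
{
	uint64_t low = r->ecx & 0xfffff000u;
	uint64_t high = r->edx & 0x000fffffu;

	return low | (high << 32);
}

/* Sum section sizes, stopping at the first sub-leaf that is not a section. */
static uint64_t total_epc_size(const struct cpuid_regs *subleaves,
			       unsigned int count)
{
	uint64_t total = 0;
	unsigned int i;

	for (i = 0; i < count; i++) {
		if ((subleaves[i].eax & EPC_TYPE_MASK) != EPC_TYPE_SECTION)
			break;
		total += epc_section_size(&subleaves[i]);
	}

	return total;
}
```

On x86 with GCC or Clang, the array could be filled with __cpuid_count(0x12, 2 + i, eax, ebx, ecx, edx) from <cpuid.h>. Keeping the decoding separate from the CPUID instruction lets the arithmetic be checked on any machine.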

For each test, the page fault handler gets triggered on two occasions:

- When SGX_IOC_ENCLAVE_INIT is performed, SECS gets swapped in by the
page fault handler.
- During the execution, each page that is referenced gets swapped in
by the page fault handler.


If I understand this correctly, all EPC pages are now being consumed during
fixture setup and thus every SGX test, no matter how big or small, now
becomes a stress test of the reclaimer instead of there being a unique
reclaimer test. Since an enclave is set up and torn down for every test this
seems like a significant addition. It also seems like this would impact
future tests of dynamic page addition where not all scenarios could be
tested with all EPC pages already consumed.

Reinette

Re-initializing the test enclave is a mandatory thing to do for all tests
because it has internal state.


Right, but not all tests require the same enclave. In kselftest terminology
I think you are attempting to force all tests to depend on the same test
fixture. Is it not possible to have a separate "reclaimer" test fixture that
would build an enclave with a large heap and then have reclaimer tests that
exercise it by being tests that are specific to this "reclaimer fixture"?

Reinette

Why add that complexity?


With this change every test is turned into a pseudo reclaimer test without
there being any explicit testing (with pass/fail criteria) of reclaimer
behavior. This is an expensive addition and reduces the scenarios that the
tests can exercise.

Reinette

There is consistent, known behaviour in how the reclaimer and also the page
fault handler are exercised for each test. I think that is what matters most
right now: that the basic behaviour of both the page reclaimer and page fault
handler gets exercised.

I believe the basic behavior of the page fault handler is currently exercised in each test; this is required.


I don't understand the real-world gain of doing something several times more
complex than necessary at a particular point in time, when you don't
really need to hang yourself into it forever.

Your argument about "hang yourself into it forever" can go both ways - why should all tests now unnecessarily consume the entire EPC forever?

If I understand correctly, adding a separate reclaimer test is not complex but would require refactoring code.

This patch does increase the coverage in a deterministic manner to the code
paths that were not previously exercised, i.e. we know the code paths, and
could even calculate the exact number of times that they are triggered. And
without doing anything obscure. That's what matters to me.

On the contrary, this is indeed obfuscating the SGX tests: if an issue shows up in the reclaimer, then all tests would fail. A unique reclaimer test would help point to where the issue may be.

Reinette