Re: [PATCH v6 09/12] x86/sgx: Restructure top-level EPC reclaim function

From: Haitao Huang
Date: Sun Nov 26 2023 - 11:27:52 EST


On Mon, 20 Nov 2023 11:45:46 +0800, Huang, Kai <kai.huang@xxxxxxxxx> wrote:

On Mon, 2023-10-30 at 11:20 -0700, Haitao Huang wrote:
From: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>

To prepare for per-cgroup reclamation, separate the top-level reclaim
function, sgx_reclaim_epc_pages(), into two separate functions:

- sgx_isolate_epc_pages() scans and isolates reclaimable pages from a given LRU list.
- sgx_do_epc_reclamation() performs the real reclamation for the already isolated pages.

Create a new function, sgx_reclaim_epc_pages_global(), calling those two
in succession, to replace the original sgx_reclaim_epc_pages(). The
above two functions will serve as building blocks for the reclamation
flows in later EPC cgroup implementation.

sgx_do_epc_reclamation() returns the number of reclaimed pages. The EPC
cgroup will use the result to track reclaiming progress.

sgx_isolate_epc_pages() returns the additional number of pages to scan
for current epoch of reclamation. The EPC cgroup will use the result to
determine if more scanning to be done in LRUs in its children groups.

This changelog says nothing about "why", but only mentions the "implementation".

For instance, assuming we need to reclaim @npages_to_reclaim from the
@epc_cgrp_to_reclaim and its descendants, why cannot we do:

for_each_cgroup_and_descendants(&epc_cgrp_to_reclaim, &epc_cgrp) {
if (npages_to_reclaim <= 0)
return;

npages_to_reclaim -= sgx_reclaim_pages_lru(&epc_cgrp->lru,
npages_to_reclaim);
}

Is there any difference to have "isolate" + "reclaim"?


This is to optimize "reclaim". See how etrack was done in sgx_encl_ewb.
Here we just follow the same design as ksgxd for each reclamation cycle.


Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
Co-developed-by: Kristen Carlson Accardi <kristen@xxxxxxxxxxxxxxx>
Signed-off-by: Kristen Carlson Accardi <kristen@xxxxxxxxxxxxxxx>
Co-developed-by: Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx>
Signed-off-by: Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx>
Cc: Sean Christopherson <seanjc@xxxxxxxxxx>
---


[...]

+/**
+ * sgx_do_epc_reclamation() - Perform reclamation for isolated EPC pages.
+ * @iso: List of isolated pages for reclamation
+ *
+ * Take a list of EPC pages and reclaim them to the enclave's private shmem files. Do not
+ * reclaim the pages that have been accessed since the last scan, and move each of those pages
+ * to the tail of its tracking LRU list.
+ *
+ * Limit the number of pages to be processed up to SGX_NR_TO_SCAN_MAX per call in order to
+ * degrade amount of IPI's and ETRACK's potentially required. sgx_encl_ewb() does degrade a bit
+ * among the HW threads with three stage EWB pipeline (EWB, ETRACK + EWB and IPI + EWB) but not
+ * sufficiently. Reclaiming one page at a time would also be problematic as it would increase
+ * the lock contention too much, which would halt forward progress.

This is kinda optimization, correct? Is there any real performance data to
justify this?

The above sentences were there originally. This optimization was justified.

If this optimization is useful, shouldn't we bring this
optimization to the current sgx_reclaim_pages() instead, e.g., just increase
SGX_NR_TO_SCAN (16) to SGX_NR_TO_SCAN_MAX (32)?


SGX_NR_TO_SCAN_MAX might be designed earlier for other reasons I don't know. Currently it is really the buffer size allocated in sgx_reclaim_pages(). Both cgroup and ksgxd scan 16 pages a time.Maybe we should just use SGX_NR_TO_SCAN. No _MAX needed. The point was to batch reclamation to certain number to minimize impact of EWB pipeline. 16 was the original design.

Thanks
Haitao