Re: [PATCH v9 10/15] x86/sgx: Add EPC reclamation in cgroup try_charge()

From: Jarkko Sakkinen
Date: Mon Feb 19 2024 - 15:20:27 EST


On Mon Feb 19, 2024 at 3:12 PM UTC, Haitao Huang wrote:
> On Tue, 13 Feb 2024 19:52:25 -0600, Jarkko Sakkinen <jarkko@xxxxxxxxxx>
> wrote:
>
> > On Tue Feb 13, 2024 at 1:15 AM EET, Haitao Huang wrote:
> >> Hi Jarkko
> >>
> >> On Mon, 12 Feb 2024 13:55:46 -0600, Jarkko Sakkinen <jarkko@xxxxxxxxxx>
> >> wrote:
> >>
> >> > On Mon Feb 5, 2024 at 11:06 PM EET, Haitao Huang wrote:
> >> >> From: Kristen Carlson Accardi <kristen@xxxxxxxxxxxxxxx>
> >> >>
> >> >> When the EPC usage of a cgroup is near its limit, the cgroup needs to
> >> >> reclaim pages used in the same cgroup to make room for new allocations.
> >> >> This is analogous to the behavior that the global reclaimer is triggered
> >> >> when the global usage is close to total available EPC.
> >> >>
> >> >> Add a Boolean parameter for sgx_epc_cgroup_try_charge() to indicate
> >> >> whether synchronous reclaim is allowed or not. And trigger the
> >> >> synchronous/asynchronous reclamation flow accordingly.
> >> >>
> >> >> Note at this point, all reclaimable EPC pages are still tracked in the
> >> >> global LRU and per-cgroup LRUs are empty. So no per-cgroup reclamation
> >> >> is activated yet.
> >> >>
> >> >> Co-developed-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> >> >> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> >> >> Signed-off-by: Kristen Carlson Accardi <kristen@xxxxxxxxxxxxxxx>
> >> >> Co-developed-by: Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx>
> >> >> Signed-off-by: Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx>
> >> >> ---
> >> >> V7:
> >> >> - Split this out from the big patch, #10 in V6. (Dave, Kai)
> >> >> ---
> >> >> arch/x86/kernel/cpu/sgx/epc_cgroup.c | 26 ++++++++++++++++++++++++--
> >> >> arch/x86/kernel/cpu/sgx/epc_cgroup.h | 4 ++--
> >> >> arch/x86/kernel/cpu/sgx/main.c | 2 +-
> >> >> 3 files changed, 27 insertions(+), 5 deletions(-)
> >> >>
> >> >> diff --git a/arch/x86/kernel/cpu/sgx/epc_cgroup.c b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
> >> >> index d399fda2b55e..abf74fdb12b4 100644
> >> >> --- a/arch/x86/kernel/cpu/sgx/epc_cgroup.c
> >> >> +++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
> >> >> @@ -184,13 +184,35 @@ static void sgx_epc_cgroup_reclaim_work_func(struct work_struct *work)
> >> >> /**
> >> >> * sgx_epc_cgroup_try_charge() - try to charge cgroup for a single EPC page
> >> >> * @epc_cg: The EPC cgroup to be charged for the page.
> >> >> + * @reclaim: Whether or not synchronous reclaim is allowed
> >> >> * Return:
> >> >> * * %0 - If successfully charged.
> >> >> * * -errno - for failures.
> >> >> */
> >> >> -int sgx_epc_cgroup_try_charge(struct sgx_epc_cgroup *epc_cg)
> >> >> +int sgx_epc_cgroup_try_charge(struct sgx_epc_cgroup *epc_cg, bool reclaim)
> >> >> {
> >> >> - return misc_cg_try_charge(MISC_CG_RES_SGX_EPC, epc_cg->cg, PAGE_SIZE);
> >> >> + for (;;) {
> >> >> + if (!misc_cg_try_charge(MISC_CG_RES_SGX_EPC, epc_cg->cg,
> >> >> + PAGE_SIZE))
> >> >> + break;
> >> >> +
> >> >> + if (sgx_epc_cgroup_lru_empty(epc_cg->cg))
> >> >> + return -ENOMEM;
> >> >> +
> >> >> + if (signal_pending(current))
> >> >> + return -ERESTARTSYS;
> >> >> +
> >> >> + if (!reclaim) {
> >> >> + queue_work(sgx_epc_cg_wq, &epc_cg->reclaim_work);
> >> >> + return -EBUSY;
> >> >> + }
> >> >> +
> >> >> + if (!sgx_epc_cgroup_reclaim_pages(epc_cg->cg, false))
> >> >> + /* All pages were too young to reclaim, try again a little later */
> >> >> + schedule();
> >> >
> >> > This will be total pain to backtrack after a while when something
> >> > needs to be changed so there definitely should be inline comments
> >> > addressing each branch condition.
> >> >
> >> > I'd rethink this as:
> >> >
> >> > 1. Create static __sgx_epc_cgroup_try_charge() for addressing single
> >> > iteration with the new "reclaim" parameter.
> >> > 2. Add a new sgx_epc_cgroup_try_charge_reclaim() function.
> >> >
> >> > There's a bit of redundancy with sgx_epc_cgroup_try_charge() and
> >> > sgx_epc_cgroup_try_charge_reclaim() because both have almost the
> >> > same loop calling internal __sgx_epc_cgroup_try_charge() with
> >> > different parameters. That is totally acceptable.
> >> >
> >> > Please also add my suggested-by.
> >> >
> >> > BR, Jarkko
> >> >
> >> For #2:
> >> The only caller of this function, sgx_alloc_epc_page(), has the same
> >> boolean which is passed into this function.
> >
> > I know. This would be a good opportunity to fix that up. A large patch
> > set should make the space for its feature the best possible, and thus
> > also clean up the code base overall.
> >
> >> If we separate it into sgx_epc_cgroup_try_charge() and
> >> sgx_epc_cgroup_try_charge_reclaim(), then the caller has to have the
> >> if/else branches. So the separation does not seem to help here?
> >
> > Of course it does. It makes the code in that location self-documenting
> > and easier to remember what it does.
> >
> > BR, Jarkko
> >
>
> Please let me know if this aligns with your suggestion.
>
>
> static int ___sgx_epc_cgroup_try_charge(struct sgx_epc_cgroup *epc_cg)
> {
> 	if (!misc_cg_try_charge(MISC_CG_RES_SGX_EPC, epc_cg->cg,
> 				PAGE_SIZE))
> 		return 0;
>
> 	if (sgx_epc_cgroup_lru_empty(epc_cg->cg))
> 		return -ENOMEM;
>
> 	if (signal_pending(current))
> 		return -ERESTARTSYS;
>
> 	return -EBUSY;
> }
>
> /**
>  * sgx_epc_cgroup_try_charge() - try to charge cgroup for a single page
>  * @epc_cg: The EPC cgroup to be charged for the page.
>  *
>  * Try to reclaim pages in the background if the group reaches its limit and
>  * there are reclaimable pages in the group.
>  * Return:
>  * * %0 - If successfully charged.
>  * * -errno - for failures.
>  */
> int sgx_epc_cgroup_try_charge(struct sgx_epc_cgroup *epc_cg)
> {
> 	int ret = ___sgx_epc_cgroup_try_charge(epc_cg);
>
> 	if (ret == -EBUSY)
> 		queue_work(sgx_epc_cg_wq, &epc_cg->reclaim_work);
>
> 	return ret;
> }
>
> /**
>  * sgx_epc_cgroup_try_charge_reclaim() - try to charge cgroup for a single page
>  * @epc_cg: The EPC cgroup to be charged for the page.
>  *
>  * Try to reclaim pages directly if the group reaches its limit and there are
>  * reclaimable pages in the group.
>  * Return:
>  * * %0 - If successfully charged.
>  * * -errno - for failures.
>  */
> int sgx_epc_cgroup_try_charge_reclaim(struct sgx_epc_cgroup *epc_cg)
> {
> 	int ret;
>
> 	for (;;) {
> 		ret = ___sgx_epc_cgroup_try_charge(epc_cg);
> 		if (ret != -EBUSY)
> 			return ret;
>
> 		if (!sgx_epc_cgroup_reclaim_pages(epc_cg->cg, current->mm))
> 			/* All pages were too young to reclaim, try again a little later */
> 			schedule();
> 	}
>
> 	return 0;
> }
>
> It is a little more involved to remove the boolean for
> sgx_alloc_epc_page() and its callers like sgx_encl_grow(),
> sgx_alloc_va_page(). I'll send a separate patch for comments.

From a quick look, this is definitely heading in the right direction.

BR, Jarkko
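
To make the shape of the split concrete, here is a minimal userspace sketch
in C. It is not kernel code and not the patch under review: struct mock_cg,
every mock_* function, and the async_requests counter are hypothetical
stand-ins (the counter stands in for queue_work()). The signal_pending()
check and the schedule() back-off are left out, because the mock reclaim
always makes progress. It only illustrates the single-attempt helper, the
two wrappers, and the if/else that moves to the call site.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for an EPC cgroup; not the kernel struct. */
struct mock_cg {
	int usage;          /* pages currently charged */
	int limit;          /* maximum pages the group may hold */
	int reclaimable;    /* pages a reclaim pass could free */
	int async_requests; /* counts stand-ins for queue_work() calls */
};

/* Single attempt: charge one page, or report why that was not possible. */
static int mock_try_charge_once(struct mock_cg *cg)
{
	if (cg->usage < cg->limit) {
		cg->usage++;
		return 0;
	}

	/* At the limit with nothing reclaimable: retrying cannot help. */
	if (cg->reclaimable == 0)
		return -ENOMEM;

	/* At the limit, but a reclaim pass could make room. */
	return -EBUSY;
}

/* Stand-in for a reclaim pass: frees at most one page, returns pages freed. */
static int mock_reclaim_pages(struct mock_cg *cg)
{
	if (cg->reclaimable == 0)
		return 0;

	cg->reclaimable--;
	cg->usage--;
	return 1;
}

/* Non-blocking variant: on -EBUSY, only request background reclaim. */
int mock_try_charge(struct mock_cg *cg)
{
	int ret = mock_try_charge_once(cg);

	if (ret == -EBUSY)
		cg->async_requests++;

	return ret;
}

/* Blocking variant: reclaim directly and retry until a final result. */
int mock_try_charge_reclaim(struct mock_cg *cg)
{
	for (;;) {
		int ret = mock_try_charge_once(cg);

		if (ret != -EBUSY)
			return ret;

		mock_reclaim_pages(cg);
	}
}

/* Hypothetical call site: the boolean now only selects which function runs. */
int mock_alloc_charge(struct mock_cg *cg, bool reclaim)
{
	assert(cg != NULL);

	if (reclaim)
		return mock_try_charge_reclaim(cg);

	return mock_try_charge(cg);
}
```

In this sketch the if/else at the call site is the cost of the split that
was raised earlier in the thread, and the two function names are what make
that location self-documenting.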