Re: [PATCH v2] x86/sgx: Replace kmap/kunmap_atomic calls

From: Fabio M. De Francesco
Date: Wed Nov 16 2022 - 05:17:29 EST


On martedì 15 novembre 2022 17:16:26 CET Kristen Carlson Accardi wrote:
> kmap_local_page() is the preferred way to create temporary mappings
> when it is feasible, because the mappings are thread-local and
> CPU-local.

The atomic mappings are thread and CPU local too.

Instead, please note that the most important difference is that
kmap_local_page() won't ever disable (1) pagefaults and (2) (depending on
PREEMPT_RT) preemption.

To make it clearer: I think that your conversions are correct and welcome. I'm
only asking for clarifications about how you show the reasons why they are
good.

> kmap_local_page() uses per-task maps rather than per-CPU
> maps. This in effect removes the need to preemption in the
> local CPU while kmap is active, and thus vastly reduces overall
> system latency.

I'm also experiencing difficulties with the paragraph above. What do you mean
by "This in effect removes the need to preemption in the local CPU while kmap
is active"? I understand that for me, being a non native English speaker, some
sentences may appear weird even in cases where they are perfectly fine.

I can't see why you are expressing the way you are, since kmap_local_page()
does not ever disable preemption.

Let's focus to the call sites you are converting: you know that the code
between kmap_atomic() and kunmap_atomic() can run in non atomic context, thus
there are no real reason to rely on any kmap_atomic() side effects (please
look above again, if necessary).

Therefore, the real good reason for converting are (1) avoid unnecessary
disabling of page faults and (2) avoid the probable disabling of preemption.

> local CPU while kmap is active

> It is also valid to take pagefaults.
>
> The use of kmap_atomic() in the SGX code was not an explicit design
> choice to disable page faults or preemption, and there is no compelling
> design reason to using kmap_atomic() vs. kmap_local_page().

This is at the core of the reasons why you are converting, that is to avoid
the potential overhead (in 32 bit architectures) of switching in atomic
context where it is not required.

> Link: https://lore.kernel.org/linux-sgx/Y0biN3%2FJsZMa0yUr@xxxxxxxxxx/
>
> For more information on the use of kmap_local_page() vs.
> kmap_atomic(), please see Documentation/mm/highmem.rst
>
> Signed-off-by: Kristen Carlson Accardi <kristen@xxxxxxxxxxxxxxx>
> ---
> Changes since V1:
>
> - Reword commit message to make it more clear why it is preferred
> to use kmap_local_page() vs. kmap_atomic().
>
> arch/x86/kernel/cpu/sgx/encl.c | 12 ++++++------
> arch/x86/kernel/cpu/sgx/ioctl.c | 4 ++--
> arch/x86/kernel/cpu/sgx/main.c | 8 ++++----
> 3 files changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 2c258255a629..68f8b18d2278 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -160,8 +160,8 @@ static int __sgx_encl_eldu(struct sgx_encl_page
> *encl_page, return ret;
>
> pginfo.addr = encl_page->desc & PAGE_MASK;
> - pginfo.contents = (unsigned long)kmap_atomic(b.contents);
> - pcmd_page = kmap_atomic(b.pcmd);
> + pginfo.contents = (unsigned long)kmap_local_page(b.contents);
> + pcmd_page = kmap_local_page(b.pcmd);
> pginfo.metadata = (unsigned long)pcmd_page + b.pcmd_offset;
>
> if (secs_page)
> @@ -187,8 +187,8 @@ static int __sgx_encl_eldu(struct sgx_encl_page
> *encl_page, */
> pcmd_page_empty = !memchr_inv(pcmd_page, 0, PAGE_SIZE);
>
> - kunmap_atomic(pcmd_page);
> - kunmap_atomic((void *)(unsigned long)pginfo.contents);
> + kunmap_local(pcmd_page);
> + kunmap_local((void *)(unsigned long)pginfo.contents);
>
> get_page(b.pcmd);
> sgx_encl_put_backing(&b);
> @@ -197,10 +197,10 @@ static int __sgx_encl_eldu(struct sgx_encl_page
> *encl_page,
>
> if (pcmd_page_empty && !reclaimer_writing_to_pcmd(encl,
pcmd_first_page)) {
> sgx_encl_truncate_backing_page(encl,
PFN_DOWN(page_pcmd_off));
> - pcmd_page = kmap_atomic(b.pcmd);
> + pcmd_page = kmap_local_page(b.pcmd);
> if (memchr_inv(pcmd_page, 0, PAGE_SIZE))
> pr_warn("PCMD page not empty after truncate.
\n");
> - kunmap_atomic(pcmd_page);
> + kunmap_local(pcmd_page);
> }
>
> put_page(b.pcmd);
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/
ioctl.c
> index ef874828fa6b..03c50f11ad75 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -221,11 +221,11 @@ static int __sgx_encl_add_page(struct sgx_encl *encl,
> pginfo.secs = (unsigned long)sgx_get_epc_virt_addr(encl-
>secs.epc_page);
> pginfo.addr = encl_page->desc & PAGE_MASK;
> pginfo.metadata = (unsigned long)secinfo;
> - pginfo.contents = (unsigned long)kmap_atomic(src_page);
> + pginfo.contents = (unsigned long)kmap_local_page(src_page);
>
> ret = __eadd(&pginfo, sgx_get_epc_virt_addr(epc_page));
>
> - kunmap_atomic((void *)pginfo.contents);
> + kunmap_local((void *)pginfo.contents);
> put_page(src_page);
>
> return ret ? -EIO : 0;
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index 0aad028f04d4..e5a37b6e9aa5 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -165,17 +165,17 @@ static int __sgx_encl_ewb(struct sgx_epc_page
*epc_page,
> void *va_slot, pginfo.addr = 0;
> pginfo.secs = 0;
>
> - pginfo.contents = (unsigned long)kmap_atomic(backing->contents);
> - pginfo.metadata = (unsigned long)kmap_atomic(backing->pcmd) +
> + pginfo.contents = (unsigned long)kmap_local_page(backing->contents);
> + pginfo.metadata = (unsigned long)kmap_local_page(backing->pcmd) +
> backing->pcmd_offset;
>
> ret = __ewb(&pginfo, sgx_get_epc_virt_addr(epc_page), va_slot);
> set_page_dirty(backing->pcmd);
> set_page_dirty(backing->contents);
>
> - kunmap_atomic((void *)(unsigned long)(pginfo.metadata -
> + kunmap_local((void *)(unsigned long)(pginfo.metadata -
> backing-
>pcmd_offset));

> - kunmap_atomic((void *)(unsigned long)pginfo.contents);
> + kunmap_local((void *)(unsigned long)pginfo.contents);
>
> return ret;
> }
> --
> 2.38.1