Re: [PATCH v3 4/4] arm64: mte: Optimize mte_assign_mem_tag_range()

From: Mark Rutland
Date: Fri Jan 15 2021 - 10:47:01 EST


On Fri, Jan 15, 2021 at 12:00:43PM +0000, Vincenzo Frascino wrote:
> mte_assign_mem_tag_range() is called on production KASAN HW hot
> paths. It makes sense to optimize it in an attempt to reduce the
> overhead.
>
> Optimize mte_assign_mem_tag_range() based on the indications provided at
> [1].

... what exactly is the optimization?

I /think/ you're just trying to have it inlined, but you should mention
that explicitly.

>
> [1] https://lore.kernel.org/r/CAAeHK+wCO+J7D1_T89DG+jJrPLk3X9RsGFKxJGd0ZcUFjQT-9Q@xxxxxxxxxxxxxx/
>
> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@xxxxxxx>
> ---
> arch/arm64/include/asm/mte.h | 26 +++++++++++++++++++++++++-
> arch/arm64/lib/mte.S | 15 ---------------
> 2 files changed, 25 insertions(+), 16 deletions(-)
>
> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
> index 1a715963d909..9730f2b07b79 100644
> --- a/arch/arm64/include/asm/mte.h
> +++ b/arch/arm64/include/asm/mte.h
> @@ -49,7 +49,31 @@ long get_mte_ctrl(struct task_struct *task);
> int mte_ptrace_copy_tags(struct task_struct *child, long request,
> unsigned long addr, unsigned long data);
>
> -void mte_assign_mem_tag_range(void *addr, size_t size);
> +static inline void mte_assign_mem_tag_range(void *addr, size_t size)
> +{
> +	u64 _addr = (u64)addr;
> +	u64 _end = _addr + size;
> +
> +	/*
> +	 * This function must be invoked from an MTE enabled context.
> +	 *
> +	 * Note: The address must be non-NULL and MTE_GRANULE_SIZE aligned and
> +	 * size must be non-zero and MTE_GRANULE_SIZE aligned.
> +	 */
> +	do {
> +		/*
> +		 * 'asm volatile' is required to prevent the compiler to move
> +		 * the statement outside of the loop.
> +		 */
> +		asm volatile(__MTE_PREAMBLE "stg %0, [%0]"
> +			     :
> +			     : "r" (_addr)
> +			     : "memory");
> +
> +		_addr += MTE_GRANULE_SIZE;
> +	} while (_addr < _end);

Is there any chance that this can be used for the last bytes of the
virtual address space? If so, `_end` wraps to 0 and the loop terminates
early, after tagging only the first granule. The loop condition might
need to change to `_addr != _end` if that is possible.
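
To illustrate the concern, here is a minimal userspace sketch (not kernel
code). GRANULE stands in for MTE_GRANULE_SIZE, the function names are made
up for this example, and a counter stands in for the stg. It compares the
loop as posted with one that only stops once the address reaches the end
exactly:

```c
#include <stdint.h>

#define GRANULE 16u	/* assumption: stand-in for MTE_GRANULE_SIZE */

/* Iterations of the loop as posted: do { ... } while (addr < end); */
uint64_t granules_lt(uint64_t addr, uint64_t size)
{
	uint64_t end = addr + size;	/* wraps to 0 at the top of the address space */
	uint64_t n = 0;

	do {
		n++;			/* stands in for the stg */
		addr += GRANULE;
	} while (addr < end);

	return n;
}

/* Same loop, but it only stops once addr reaches end exactly. */
uint64_t granules_ne(uint64_t addr, uint64_t size)
{
	uint64_t end = addr + size;
	uint64_t n = 0;

	do {
		n++;
		addr += GRANULE;
	} while (addr != end);

	return n;
}
```

For a 64-byte range ending at the very top of a 64-bit address space, the
first variant stops after one granule, while the second covers all four.
Note that the second variant relies on the documented precondition that
both the address and the size are granule aligned; without it the loop
would never hit `end` exactly.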

> +}

What does the code generation look like for this, relative to the
assembly version?

Thanks,
Mark.

> +
>
> #else /* CONFIG_ARM64_MTE */
>
> diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
> index 9e1a12e10053..a0a650451510 100644
> --- a/arch/arm64/lib/mte.S
> +++ b/arch/arm64/lib/mte.S
> @@ -150,18 +150,3 @@ SYM_FUNC_START(mte_restore_page_tags)
> ret
> SYM_FUNC_END(mte_restore_page_tags)
>
> -/*
> - * Assign allocation tags for a region of memory based on the pointer tag
> - * x0 - source pointer
> - * x1 - size
> - *
> - * Note: The address must be non-NULL and MTE_GRANULE_SIZE aligned and
> - * size must be non-zero and MTE_GRANULE_SIZE aligned.
> - */
> -SYM_FUNC_START(mte_assign_mem_tag_range)
> -1:	stg	x0, [x0]
> -	add	x0, x0, #MTE_GRANULE_SIZE
> -	subs	x1, x1, #MTE_GRANULE_SIZE
> -	b.gt	1b
> -	ret
> -SYM_FUNC_END(mte_assign_mem_tag_range)
> --
> 2.30.0
>