Re: [PATCH v4 19/36] arm64/gcs: Allocate a new GCS for threads with GCS enabled

From: Catalin Marinas
Date: Fri Aug 11 2023 - 12:26:15 EST


On Mon, Aug 07, 2023 at 11:00:24PM +0100, Mark Brown wrote:
> diff --git a/arch/arm64/mm/gcs.c b/arch/arm64/mm/gcs.c
> index b0a67efc522b..1e059c37088d 100644
> --- a/arch/arm64/mm/gcs.c
> +++ b/arch/arm64/mm/gcs.c
> @@ -8,6 +8,62 @@
> #include <asm/cpufeature.h>
> #include <asm/page.h>
>
> +static unsigned long alloc_gcs(unsigned long addr, unsigned long size,
> + unsigned long token_offset, bool set_res_tok)
> +{
> + int flags = MAP_ANONYMOUS | MAP_PRIVATE;
> + struct mm_struct *mm = current->mm;
> + unsigned long mapped_addr, unused;
> +
> + if (addr)
> + flags |= MAP_FIXED_NOREPLACE;
> +
> + mmap_write_lock(mm);
> + mapped_addr = do_mmap(NULL, addr, size, PROT_READ, flags,
> + VM_SHADOW_STACK | VM_WRITE, 0, &unused, NULL);

Why not PROT_WRITE as well? I guess I need to check the x86 patches
since the do_mmap() called here has a different prototype than what's in
mainline.

This gets confusing since currently the VM_* flags are derived from the
PROT_* flags passed to mmap(). But you skip the PROT_WRITE in favour of
adding VM_WRITE directly.
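
To make the concern concrete, here is a rough userspace model of the
PROT_* to VM_* translation. It is only a sketch: the sk_ names are made
up for illustration, the shadow stack bit is a placeholder rather than
the real VM_SHADOW_STACK value, and it ignores everything else do_mmap()
does with the protection bits (VM_MAY* flags, page protections).

```c
#include <stdint.h>

/* Generic PROT_* and low VM_* values; prefixed to avoid clashing with headers. */
#define SK_PROT_READ		0x1u
#define SK_PROT_WRITE		0x2u
#define SK_PROT_EXEC		0x4u

#define SK_VM_READ		0x1ull
#define SK_VM_WRITE		0x2ull
#define SK_VM_EXEC		0x4ull
#define SK_VM_SHADOW_STACK	(1ull << 32)	/* placeholder bit, not the real one */

/* Simplified model: each PROT_* bit selects the matching VM_* bit. */
static uint64_t sk_prot_to_vm(unsigned int prot)
{
	uint64_t vm = 0;

	if (prot & SK_PROT_READ)
		vm |= SK_VM_READ;
	if (prot & SK_PROT_WRITE)
		vm |= SK_VM_WRITE;
	if (prot & SK_PROT_EXEC)
		vm |= SK_VM_EXEC;
	return vm;
}

/* As posted: PROT_READ only, with VM_WRITE added by hand. */
static uint64_t sk_gcs_flags_as_posted(void)
{
	return sk_prot_to_vm(SK_PROT_READ) | SK_VM_SHADOW_STACK | SK_VM_WRITE;
}

/* Alternative being asked about: let PROT_WRITE supply VM_WRITE. */
static uint64_t sk_gcs_flags_with_prot_write(void)
{
	return sk_prot_to_vm(SK_PROT_READ | SK_PROT_WRITE) | SK_VM_SHADOW_STACK;
}
```

In this simplified model both spellings end up with the same flag set.
Whether that also holds for the real do_mmap() path is the part that
needs checking against the x86 series.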

I haven't followed the x86 discussion but did we run out of PROT_* bits
for a PROT_SHADOW_STACK?

> + mmap_write_unlock(mm);
> +
> + return mapped_addr;
> +}
> +
> +static unsigned long gcs_size(unsigned long size)
> +{
> + if (size)
> + return PAGE_ALIGN(size);
> +
> + /* Allocate RLIMIT_STACK with limits of PAGE_SIZE..4G */
> + size = PAGE_ALIGN(min_t(unsigned long long,
> + rlimit(RLIMIT_STACK), SZ_4G));
> + return max(PAGE_SIZE, size);
> +}

I saw Szabolcs commenting on the default size as well. Maybe we should
go for RLIMIT_STACK/2 but let's see how the other sub-thread is going.
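
For comparison, a small userspace sketch of the default as posted next
to a halved variant. The sk_ names and the 4K page size are assumptions
for illustration, and halving after the 4G clamp (so the cap becomes 2G)
is just one possible choice, not something agreed in the other
sub-thread.

```c
#include <stdint.h>

#define SK_PAGE_SIZE	4096ull		/* assumption: 4K pages */
#define SK_SZ_4G	(1ull << 32)

static uint64_t sk_page_align(uint64_t x)
{
	return (x + SK_PAGE_SIZE - 1) & ~(SK_PAGE_SIZE - 1);
}

static uint64_t sk_min(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

static uint64_t sk_max(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}

/* Default as posted: RLIMIT_STACK clamped to PAGE_SIZE..4G. */
static uint64_t sk_gcs_size_posted(uint64_t size, uint64_t rlim_stack)
{
	if (size)
		return sk_page_align(size);

	size = sk_page_align(sk_min(rlim_stack, SK_SZ_4G));
	return sk_max(SK_PAGE_SIZE, size);
}

/* Hypothetical variant: half of the clamped RLIMIT_STACK, so PAGE_SIZE..2G. */
static uint64_t sk_gcs_size_half(uint64_t size, uint64_t rlim_stack)
{
	if (size)
		return sk_page_align(size);

	size = sk_page_align(sk_min(rlim_stack, SK_SZ_4G) / 2);
	return sk_max(SK_PAGE_SIZE, size);
}
```

With the common 8M stack limit the posted default comes out at 8M and
the halved variant at 4M; with an unlimited RLIMIT_STACK they come out
at 4G and 2G respectively.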

> +
> +unsigned long gcs_alloc_thread_stack(struct task_struct *tsk,
> + unsigned long clone_flags, size_t size)
> +{
> + unsigned long addr;
> +
> + if (!system_supports_gcs())
> + return 0;
> +
> + if (!task_gcs_el0_enabled(tsk))
> + return 0;
> +
> + if ((clone_flags & (CLONE_VFORK | CLONE_VM)) != CLONE_VM)
> + return 0;

Is it safe for a CLONE_VFORK child not to get a new shadow stack? The
exec syscall in the child could push something onto the stack. I guess
the GCS pointer in the parent stays the same, so it wouldn't matter.

That said, I think this check should be somewhere higher up in the
caller of gcs_alloc_thread_stack(). The copy_thread_gcs() function
already does most of the above checks. Is the GCS allocation called from
elsewhere as well?
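
For reference, the clone_flags test quoted above boils down to the
following predicate. This is a standalone sketch with a made-up helper
name; the flag values are the usual CLONE_VM and CLONE_VFORK ones,
prefixed so it builds without kernel headers.

```c
#include <stdbool.h>

#define SK_CLONE_VM	0x00000100ul
#define SK_CLONE_VFORK	0x00004000ul

/*
 * True only when the child shares the mm (CLONE_VM) and is not a vfork
 * child (no CLONE_VFORK); this mirrors the != CLONE_VM early return in
 * the quoted patch.
 */
static bool sk_wants_new_gcs(unsigned long clone_flags)
{
	return (clone_flags & (SK_CLONE_VFORK | SK_CLONE_VM)) == SK_CLONE_VM;
}
```

So only the plain CLONE_VM (thread-like) case allocates a new GCS:
fork() without CLONE_VM, vfork() with CLONE_VM | CLONE_VFORK, and
CLONE_VFORK on its own all take the early return.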

--
Catalin