Re: [PATCH v4 03/36] arm64/gcs: Document the ABI for Guarded Control Stacks

From: Szabolcs Nagy
Date: Tue Oct 03 2023 - 04:46:57 EST


The 10/02/2023 20:49, Mark Brown wrote:
> On Thu, Sep 28, 2023 at 05:59:25PM +0100, Szabolcs Nagy wrote:
> > The 08/23/2023 14:11, Catalin Marinas wrote:
>
> > > > and there is user code doing raw clone threads (such threads are
> > > > technically not allowed to call into libc) it's not immediately
> > > > clear to me if having gcs in those threads is better or worse.
>
> > i think raw clone / clone3 users may be relevant so we need a
> > solution such that they don't fail when gcs args are missing.
>
> Are we sure about that? Old binaries shouldn't be affected since they
> won't turn GCS on so we're just talking about new binaries here - are there
> really so many of them that we won't be able to get them all converted
> over to clone3() and GCS in the timescales we're talking about for GCS
> deployment? I obviously don't particularly mind having the default size
> logic but if we allow clone() then that's keeping the existing behaviour
> and layering allocation via clone3() on top of it which Catalin didn't
> want. Catalin?

clone3 has features that are not (reasonably) exposed through libc
apis, so people will call clone3 directly, and those callers will be
hard to fix for gcs (you have to convince each upstream to carry
future arm64 specific changes that they cannot test). where this
analysis might be wrong: raw clone3 is more likely used as fork/vfork
without a new stack, in which case there is no gcs issue.
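
For illustration, a minimal sketch of the fork-like raw clone3 case mentioned above: no new stack is passed, so the child keeps running on its copy of the parent's stack. The function name `clone3_fork_like` and the exit code 7 are made up for this example; `struct clone_args` and `SYS_clone3` are the existing Linux uapi names. This is not code from the patch series.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/sched.h>   /* struct clone_args */
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: use raw clone3 like fork(). args.stack and
 * args.stack_size stay 0, so no new stack (and no new thread-style
 * context) is requested. Returns the child's exit status, or -errno.
 */
int clone3_fork_like(void)
{
    struct clone_args args;
    int status;
    long pid;

    memset(&args, 0, sizeof(args));
    args.exit_signal = SIGCHLD;          /* behave like fork() */

    pid = syscall(SYS_clone3, &args, sizeof(args));
    if (pid < 0)
        return -errno;                   /* e.g. ENOSYS on old kernels */
    if (pid == 0)
        _exit(7);                        /* child: leave immediately */

    if (waitpid((pid_t)pid, &status, 0) < 0)
        return -errno;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -ECHILD;
}
```

A caller of this shape passes nothing stack related, which is why it would not need gcs specific changes.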

even if we have time to fix code, we don't want too many ifdef hacks
just for gcs so it matters how many projects are affected.

> > userspace allocated gcs works for me, but maybe the alternative
> > with size only is more consistent (thread gcs is kernel mapped
> > with fallback size logic if gcs size is missing):
>
> If we have size only then the handling of GCS and normal stack in struct
> clone_args would be inconsistent. Given that it seems better to have
> the field present, we can allow it to be NULL and do the allocation with
> the specified size but it should be there.

i see, then try the original plan.
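
The handling described in the quoted paragraph could be sketched roughly as below. Everything here is an assumption for illustration: the struct, the field names `gcs` and `gcs_size`, and the enum are hypothetical stand-ins, not the actual `struct clone_args` layout, and the fallback size is passed in as a parameter because this thread does not fix its value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two GCS fields under discussion. */
struct gcs_request {
    uint64_t gcs;       /* userspace-allocated GCS, 0 (NULL) if none */
    uint64_t gcs_size;  /* requested size in bytes, 0 if unspecified */
};

enum gcs_source {
    GCS_FROM_USER,       /* caller mapped the GCS itself */
    GCS_KERNEL_SIZED,    /* kernel allocates with the requested size */
    GCS_KERNEL_DEFAULT   /* kernel allocates with the fallback size */
};

struct gcs_plan {
    enum gcs_source source;
    uint64_t size;
};

/* Decide where the new thread's GCS comes from and how large it is. */
struct gcs_plan plan_thread_gcs(const struct gcs_request *req,
                                uint64_t default_size)
{
    struct gcs_plan plan;

    assert(req != NULL);
    if (req->gcs != 0) {
        plan.source = GCS_FROM_USER;
        plan.size = req->gcs_size;
    } else if (req->gcs_size != 0) {
        plan.source = GCS_KERNEL_SIZED;
        plan.size = req->gcs_size;
    } else {
        /* Both fields missing: the raw clone / clone3 case. */
        plan.source = GCS_KERNEL_DEFAULT;
        plan.size = default_size;
    }
    return plan;
}
```

With both fields zero the caller still gets a usable GCS, which is the property wanted for raw clone / clone3 users that pass no gcs args.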

> > the main thread gcs is still special: the size is provided
> > via prctl (if at all).
>
> Either that or we have it do a map_shadow_stack() but that's an extra
> syscall during startup.

an extra syscall is not too bad for the gcs enabled case.