Re: [PATCH] arm64/signal: Don't assume that TIF_SVE means we saved SVE state

From: Dave Martin
Date: Tue Jan 30 2024 - 09:45:07 EST


On Tue, Jan 30, 2024 at 02:09:34PM +0000, Mark Brown wrote:
> On Tue, Jan 30, 2024 at 11:51:07AM +0000, Will Deacon wrote:
> > On Fri, Jan 19, 2024 at 12:29:13PM +0000, Mark Brown wrote:
>
> > > -	if (test_thread_flag(TIF_SVE))
> > > +	if (current->thread.fp_type == FP_STATE_SVE)
> > >  		sve_to_fpsimd(current);
> > >  }
>
> > I don't think this hunk applies on -rc2 ^^.
>
> Hrm, git seemed to figure out a rebase with no intervention - I've
> thrown it at my CI and will resend assuming no changes from the rest of
> the discussion.
>
> > > -	if (add_all || test_thread_flag(TIF_SVE) ||
> > > +	if (add_all || current->thread.fp_type == FP_STATE_SVE ||
> > >  	    thread_sm_enabled(&current->thread)) {
> > >  		int vl = max(sve_max_vl(), sme_max_vl());
>
> > I think this code is preemptible, so I'm struggling to understand what
> > happens if the fp_type changes under our feet as a result of a context
> > switch.
>
> We are relying here on having forced a flush of the floating point
> register state prior to this code running, simple preemption won't
> change the state from what was already saved. The same consideration
> also applies to the check for streaming mode here.
>
> That said if this is preempted ptrace *could* come in and rewrite the
> data or at worst change the vector length (which could leave us with
> sve_state deallocated or a different size, possibly while we're in the
> middle of accessing it). This could also happen with the existing check
> for TIF_SVE so I don't think there's anything new here - AFAICT this has
> always been an issue with the vector code, unless I'm missing some
> bigger thing which excludes ptrace. I think any change that's needed
> there won't overlap with this one, I'm looking.

I'm pretty sure that terrible things will happen treewide if ptrace can
ever access or manipulate the internal state of a _running_ task.

I think the logic is that any ptrace call that can access or manipulate
the state of a task is gated on the task being ptrace-stopped. Once we
have committed to delivering a signal, we have obviously run past the
opportunity to stop (and hence be ptraced) on that signal.
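
For illustration only, the gating is observable from userspace. The sketch
below is not the kernel-side evidence being looked for, and the helper names
try_getregs() and ptrace_stop_gate_demo() are made up for this example. It
assumes Linux with PTRACE_SEIZE available and ptrace of a child permitted:
register access on a seized but still-running tracee should be refused with
ESRCH, and should succeed once the tracee has entered a ptrace-stop.

```c
#define _GNU_SOURCE
#include <elf.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: try to read NT_PRSTATUS; return 0 or the errno. */
static int try_getregs(pid_t pid)
{
	long buf[512];	/* assumed large enough for the GPR regset */
	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };

	errno = 0;
	if (ptrace(PTRACE_GETREGSET, pid, (void *)NT_PRSTATUS, &iov) == -1)
		return errno;
	return 0;
}

/*
 * Hypothetical demo: returns 0 if register access was refused with ESRCH
 * while the tracee was running and allowed once it was ptrace-stopped,
 * 1 if the observed behaviour differs, -1 if the setup itself failed
 * (e.g. ptrace is not permitted in this environment).
 */
static int ptrace_stop_gate_demo(void)
{
	int status, running_err, stopped_err;
	int result = -1;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		for (;;)
			pause();	/* child: sleep, never ptrace-stopped on its own */
	}

	/* PTRACE_SEIZE attaches without stopping the tracee. */
	if (ptrace(PTRACE_SEIZE, pid, (void *)0, (void *)0) == -1)
		goto out;

	running_err = try_getregs(pid);	/* tracee not stopped */

	/* Force a ptrace-stop and wait for it to be reported. */
	if (ptrace(PTRACE_INTERRUPT, pid, (void *)0, (void *)0) == -1)
		goto out;
	if (waitpid(pid, &status, 0) != pid || !WIFSTOPPED(status))
		goto out;

	stopped_err = try_getregs(pid);	/* tracee now ptrace-stopped */

	result = (running_err == ESRCH && stopped_err == 0) ? 0 : 1;
out:
	kill(pid, SIGKILL);
	waitpid(pid, NULL, 0);
	return result;
}
```

Calling ptrace_stop_gate_demo() from a small main() and checking for a zero
return is enough to try this out. It only shows the syscall-level behaviour;
it says nothing about where in the signal delivery path the last opportunity
to stop actually is.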

Cases where multiple signals are delivered before actually reaching
userspace might want some thought.

I haven't tracked down the smokeproof gun in the code yet, though.


From memory, I think that the above forced flush was there to protect
against the context switch code rather than ptrace, and guarantees that
any change that ctxsw _might_ spontaneously make to the task state has
already been done and dusted before we do the actual signal delivery.
This may be a red herring so far as ptrace hazards are concerned.

Cheers
---Dave