Re: [PATCH 3/4] openrisc: Support floating point user api

From: Stafford Horne
Date: Tue Jun 27 2023 - 16:20:57 EST


On Tue, Jun 27, 2023 at 03:27:19PM -0400, Rich Felker wrote:
> On Tue, Jun 27, 2023 at 07:56:38PM +0200, Szabolcs Nagy wrote:
> > * Stafford Horne <shorne@xxxxxxxxx> [2023-06-27 17:41:03 +0100]:
> > > On Mon, Jun 26, 2023 at 11:38:40PM +0200, Szabolcs Nagy wrote:
> > > > * Stafford Horne <shorne@xxxxxxxxx> [2023-04-18 17:58:12 +0100]:
> > > > > Add support for handling floating point exceptions and forwarding the
> > > > > SIGFPE signal to processes. Also, add fpu state to sigcontext.
> > > > >
> > > > > Signed-off-by: Stafford Horne <shorne@xxxxxxxxx>
> > > > > ---
> > > > ...
> > > > > --- a/arch/openrisc/include/uapi/asm/sigcontext.h
> > > > > +++ b/arch/openrisc/include/uapi/asm/sigcontext.h
> > > > > @@ -28,6 +28,7 @@
> > > > >
> > > > >  struct sigcontext {
> > > > >  	struct user_regs_struct regs;  /* needs to be first */
> > > > > +	struct __or1k_fpu_state fpu;
> > > > >  	unsigned long oldmask;
> > > > >  };
> > > >
> > > > this seems to break userspace abi.
> > > > glibc and musl have or1k abi without this field.
> > > >
> > > > either this is a new abi where binaries opt-in with some marking
> > > > and then the base sigcontext should be unmodified,
> > > >
> > > > or the fp state needs to be added to the signal frame in a way that
> > > > does not break existing abi (e.g. end of the struct ?) and also
> > > > advertise the new thing via a hwcap, otherwise userspace cannot
> > > > make use of it.
> > > >
> > > > unless i'm missing something.
> > >
> > > I think you are right, I meant to look into this but it must have slipped
> > > through. Is this something causing you issues or did you just notice it?
> >
> > i noticed it while trying to update musl headers to linux 6.4 uapi.
> >
> > > I didn't run into issues when running the glibc test suite, but I may have
> > > missed it.
> >
> > i would only expect issues when accessing ucontext entries
> > after uc_mcontext.regs in a signal handler registered with
> > SA_SIGINFO.
> >
> > in particular uc_sigmask is after uc_mcontext on or1k and e.g.
> > musl thread cancellation uses this entry to affect the mask on
> > signal return which will not work on a 6.4 kernel (not tested).
> >
> > i don't think glibc has tests for the ucontext signal abi.
> >
> > > Just moving this to the end of the sigcontext may be all that is needed.
> >
> > that won't help since uc_sigmask comes after sigcontext in ucontext.
> > it has to go to the end of ucontext or outside of ucontext then.
> >
> > one way to have fpu in sigcontext is
> >
> > struct sigcontext {
> > 	struct user_regs_struct regs;
> > 	unsigned long oldmask;
> > 	char padding[sizeof(__userspace_sigset_t)];
> > 	struct __or1k_fpu_state fpu;
> > };
> >
> > but the kernel still has to interpret the padding in a bwcompat
> > way. (and if libc wants to expose fpu in its ucontext then it
> > needs a flag day abi break as the ucontext size is abi.)
> >
> > (part of the userspace uc_sigmask is unused because sigset_t is
> > larger than necessary so maybe that can be reused but this is
> > a hack as that's libc owned.)
> >
> > not sure how important this fpu field is, arm does not seem to
> > have fpu state in ucontext and armhf works.
> >
> > there may be other ways, i'm adding Rich (musl maintainer) on cc
> > in case he has an opinion.
>
> Indeed, mcontext_t cannot be modified because uc_sigmask follows it in
> ucontext_t. The only clean solution here is probably to store the
> additional data at offsets past
>
> sizeof(struct sigcontext) + sizeof(sigset_t)
>
> and not expose this at all in the uapi types. Some hwcap flag can
> inform userspace that this additional space is present and accessible
> if that's needed.
>
> Optionally you could consider exposing this in the uapi headers'
> ucontext_t structure; whether it's an API breakage depends on whether
> userspace is relying on being able to allocate its own ucontext_t etc.
> This would leave the actual userspace headers (provided by libc) free
> to decide whether to modify their type or not according to an
> assessment of whether it's a breaking change to application linkage.
>
> What's not workable though is the ABI break that shipped in 6.4. It's
> a serious violation of "don't break userspace" and makes existing
> application binaries just *not work* (cancellation breaks and possibly
> corrupts program state). This needs to be reverted.

Hi Szabolcs, Rich,

Let me work on reverting the bits that try to expose fpcsr in sigcontext. I am
very aware of rules about not breaking userspace, but for some reason this was
completely missed.
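
For reference, the layout problem described above can be sketched with mock
structures. This is only an illustration under stated assumptions, not the
kernel or libc headers: every mock_* type and every function name below is
hypothetical, 32-bit fields stand in for or1k's unsigned long and pointers,
user_regs_struct is assumed to be 32 GPRs plus pc and sr, the fpu state is
assumed to be a single fpcsr word, the userspace sigset_t is assumed to be
128 bytes, and the ucontext field order is assumed to be the generic
uc_flags, uc_link, uc_stack, uc_mcontext, uc_sigmask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All mock_* types are illustrative stand-ins, not real uapi definitions. */
struct mock_user_regs { uint32_t gpr[32]; uint32_t pc; uint32_t sr; };
struct mock_fpu_state { uint32_t fpcsr; };
struct mock_sigset    { uint32_t bits[32]; };   /* assumed 128-byte sigset_t */
struct mock_stack     { uint32_t ss_sp; int32_t ss_flags; uint32_t ss_size; };

/* sigcontext as existing userspace expects it. */
struct mock_sigcontext_old {
	struct mock_user_regs regs;
	uint32_t oldmask;
};

/* sigcontext with the fpu field inserted in the middle, as in the patch. */
struct mock_sigcontext_new {
	struct mock_user_regs regs;
	struct mock_fpu_state fpu;
	uint32_t oldmask;
};

/* The padded variant suggested in the thread: fpu placed after the mask. */
struct mock_sigcontext_padded {
	struct mock_user_regs regs;
	uint32_t oldmask;
	char padding[sizeof(struct mock_sigset)];
	struct mock_fpu_state fpu;
};

/* ucontext as userspace sees it. */
struct mock_ucontext_libc {
	uint32_t uc_flags;
	uint32_t uc_link;
	struct mock_stack uc_stack;
	struct mock_sigcontext_old uc_mcontext;
	struct mock_sigset uc_sigmask;
};

/* ucontext as a kernel with the modified sigcontext would lay it out. */
struct mock_ucontext_kernel {
	uint32_t uc_flags;
	uint32_t uc_link;
	struct mock_stack uc_stack;
	struct mock_sigcontext_new uc_mcontext;
	struct mock_sigset uc_sigmask;
};

/* The modified sigcontext grows by exactly the size of the fpu state. */
static_assert(sizeof(struct mock_sigcontext_new) ==
	      sizeof(struct mock_sigcontext_old) + sizeof(struct mock_fpu_state),
	      "fpu field grows sigcontext");

/* Where userspace reads and writes uc_sigmask. */
size_t sigmask_offset_libc(void)
{
	return offsetof(struct mock_ucontext_libc, uc_sigmask);
}

/* Where the kernel with the modified sigcontext places uc_sigmask. */
size_t sigmask_offset_kernel(void)
{
	return offsetof(struct mock_ucontext_kernel, uc_sigmask);
}

/* Number of bytes by which the two views of uc_sigmask disagree. */
long sigmask_shift(void)
{
	return (long)sigmask_offset_kernel() - (long)sigmask_offset_libc();
}

/* Offset of fpu inside a ucontext when the padded sigcontext is used. */
size_t fpu_offset_padded(void)
{
	return offsetof(struct mock_ucontext_libc, uc_mcontext) +
	       offsetof(struct mock_sigcontext_padded, fpu);
}
```

Under these assumptions sigmask_shift() is non-zero, which is the mismatch
that breaks a handler writing uc_sigmask, while fpu_offset_padded() lands
exactly at the end of the userspace uc_sigmask, so the padded layout leaves
the existing fields where userspace expects them.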

I don't think we have any need to expose this to userspace at the moment, so I
would prefer to just leave the fpu state out of sigcontext, if that is acceptable.

The fix will take me about a day or two to get tested and sent.

-Stafford