Re: [PATCH 3/4] openrisc: Support floating point user api

From: Stafford Horne
Date: Sun Jul 23 2023 - 17:05:07 EST


On Tue, Jun 27, 2023 at 03:27:19PM -0400, Rich Felker wrote:
> On Tue, Jun 27, 2023 at 07:56:38PM +0200, Szabolcs Nagy wrote:
> > * Stafford Horne <shorne@xxxxxxxxx> [2023-06-27 17:41:03 +0100]:
> > > On Mon, Jun 26, 2023 at 11:38:40PM +0200, Szabolcs Nagy wrote:
> > > > * Stafford Horne <shorne@xxxxxxxxx> [2023-04-18 17:58:12 +0100]:
> > > > > Add support for handling floating point exceptions and forwarding the
> > > > > SIGFPE signal to processes. Also, add fpu state to sigcontext.
> > > > >
> > > > > Signed-off-by: Stafford Horne <shorne@xxxxxxxxx>
> > > > > ---
> > > > ...
> > > > > --- a/arch/openrisc/include/uapi/asm/sigcontext.h
> > > > > +++ b/arch/openrisc/include/uapi/asm/sigcontext.h
> > > > > @@ -28,6 +28,7 @@
> > > > >
> > > > > struct sigcontext {
> > > > > struct user_regs_struct regs; /* needs to be first */
> > > > > + struct __or1k_fpu_state fpu;
> > > > > unsigned long oldmask;
> > > > > };
> > > >
> > > > this seems to break userspace abi.
> > > > glibc and musl have or1k abi without this field.
> > > >
> > > > either this is a new abi where binaries opt-in with some marking
> > > > and then the base sigcontext should be unmodified,
> > > >
> > > > or the fp state needs to be added to the signal frame in a way that
> > > > does not break existing abi (e.g. end of the struct ?) and also
> > > > advertise the new thing via a hwcap, otherwise userspace cannot
> > > > make use of it.
> > > >
> > > > unless i'm missing something.
> > >
> > > I think you are right, I meant to look into this but it must have slipped
> > > through. Is this something causing you issues or did you just notice it?
> >
> > i noticed it while trying to update musl headers to linux 6.4 uapi.
> >
> > > I didn't run into issues when running the glibc test suite, but I may have
> > > missed it.
> >
> > i would only expect issues when accessing ucontext entries
> > after uc_mcontext.regs in a signal handler registered with
> > SA_SIGINFO.
> >
> > in particular uc_sigmask is after uc_mcontext on or1k and e.g.
> > musl thread cancellation uses this entry to affect the mask on
> > signal return which will not work on a 6.4 kernel (not tested).
> >
> > i don't think glibc has tests for the ucontext signal abi.
> >
> > > Just moving this to the end of the sigcontext may be all that is needed.
> >
> > that won't help since uc_sigmask comes after sigcontext in ucontext.
> > it has to go to the end of ucontext or outside of ucontext then.
> >
> > one way to have fpu in sigcontext is
> >
> > struct sigcontext {
> > struct user_regs_struct regs;
> > unsigned long oldmask;
> > char padding[sizeof(__userspace_sigset_t)];
> > struct __or1k_fpu_state fpu;
> > };
> >
> > but the kernel still has to interpret the padding in a bwcompat
> > way. (and if libc wants to expose fpu in its ucontext then it
> > needs a flag day abi break as the ucontext size is abi.)
> >
> > (part of the userspace uc_sigmask is unused because sigset_t is
> > larger than necessary so maybe that can be reused but this is
> > a hack as that's libc owned.)
> >
> > not sure how important this fpu field is, arm does not seem to
> > have fpu state in ucontext and armhf works.
> >
> > there may be other ways, i'm adding Rich (musl maintainer) on cc
> > in case he has an opinion.
>
> Indeed, mcontext_t cannot be modified because uc_sigmask follows it in
> ucontext_t. The only clean solution here is probably to store the
> additional data at offsets past
>
> sizeof(struct sigcontext) + sizeof(sigset_t)
>
> and not expose this at all in the uapi types. Some hwcap flag can
> inform userspace that this additional space is present and accessible
> if that's needed.
>
> Optionally you could consider exposing this in the uapi headers'
> ucontext_t structure; whether it's an API breakage depends on whether
> userspace is relying on being able to allocate its own ucontext_t etc.
> This would leave the actual userspace headers (provided by libc) free
> to decide whether to modify their type or not according to an
> assessment of whether it's a breaking change to application linkage.
>
> What's not workable though is the ABI break that shipped in 6.4. It's
> a serious violation of "don't break userspace" and makes existing
> application binaries just *not work* (cancellation breaks and possibly
> corrupts program state). This needs to be reverted.

Hi Szabolcs, Rich,

My fix for this has now made it into the 6.4.5 stable release. You should be
able to use this release to update musl.

The fix was to use some unused space in sigcontext for the fpu state, so the
size of sigcontext and the offset of uc_sigmask in ucontext do not change.
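
For anyone following along, here is a rough sketch of why reusing existing
space keeps the userspace layout stable while adding a member does not. All
demo_* names are stand-ins made up for illustration, not the real uapi
definitions, and overlaying the FPU status word on oldmask is only an assumed
example of "unused space"; please check the 6.4.5 headers for the actual
definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the real or1k uapi types. */
struct demo_user_regs { unsigned long gpr[32]; unsigned long pc; unsigned long sr; };
struct demo_sigset    { unsigned long sig[2]; };
struct demo_fpu_state { unsigned long fpcsr; };

/* Layout userspace was built against (as quoted in the thread). */
struct demo_sigcontext_old {
	struct demo_user_regs regs;	/* needs to be first */
	unsigned long oldmask;
};

/* 6.4-style layout: the extra member grows the struct. */
struct demo_sigcontext_grown {
	struct demo_user_regs regs;
	struct demo_fpu_state fpu;
	unsigned long oldmask;
};

/* Assumed illustration of reusing existing space: same size as the old one. */
struct demo_sigcontext_reuse {
	struct demo_user_regs regs;
	union {
		unsigned long fpcsr;
		unsigned long oldmask;
	} u;
};

/* Simplified ucontext: only the two members that matter here. */
struct demo_ucontext_old   { struct demo_sigcontext_old   uc_mcontext; struct demo_sigset uc_sigmask; };
struct demo_ucontext_grown { struct demo_sigcontext_grown uc_mcontext; struct demo_sigset uc_sigmask; };
struct demo_ucontext_reuse { struct demo_sigcontext_reuse uc_mcontext; struct demo_sigset uc_sigmask; };

size_t demo_sigmask_offset_old(void)   { return offsetof(struct demo_ucontext_old, uc_sigmask); }
size_t demo_sigmask_offset_grown(void) { return offsetof(struct demo_ucontext_grown, uc_sigmask); }
size_t demo_sigmask_offset_reuse(void) { return offsetof(struct demo_ucontext_reuse, uc_sigmask); }

/* Checks the two properties discussed in the thread. */
void demo_check_layouts(void)
{
	/* Adding a member moves uc_sigmask, which is the ABI break. */
	assert(demo_sigmask_offset_grown() > demo_sigmask_offset_old());
	/* Reusing existing space leaves uc_sigmask where old binaries expect it. */
	assert(demo_sigmask_offset_reuse() == demo_sigmask_offset_old());
}
```

Calling demo_check_layouts() from a small test program should pass on any
ABI, since the check depends only on the relative sizes of the three
stand-in structs.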

-Stafford