Re: siginfo_t ABI break on sparc64 from si_addr_lsb move 3y ago

From: Eric W. Biederman
Date: Fri Apr 30 2021 - 16:15:38 EST


Marco Elver <elver@xxxxxxxxxx> writes:

> On Fri, Apr 30, 2021 at 12:08PM -0500, Eric W. Biederman wrote:
>> Arnd Bergmann <arnd@xxxxxxxx> writes:
> [...]
>> >> I did a quick search and the architectures that define __ARCH_SI_TRAPNO
>> >> are sparc, mips, and alpha. All have 64bit implementations. A further
>> >> quick search shows that none of those architectures have faults that
>> >> use BUS_MCEERR_AR, BUS_MCEERR_AO, SEGV_BNDERR, SEGV_PKUERR, nor do
>> >> they appear to use mm/memory-failure.c
>> >>
>> >> So it doesn't look like we have an ABI regression to fix.
>> >
>> > Even better!
>> >
>> > So if sparc is the only user of _trapno and it uses none of the later
>> > fields in _sigfault, I wonder if we could take even more liberty at
>> > trying to have a slightly saner definition. Can you think of anything that
>> > might break if we put _trapno inside of the union along with _perf
>> > and _addr_lsb?
>>
>> On sparc si_trapno is only set when SIGILL ILL_TRP is set. So we can
>> limit si_trapno to that combination, and it should not be a problem for
>> a new signal/si_code pair to use that storage. Precisely because it is
>> new.
>>
>> Similarly on alpha si_trapno is only set for:
>>
>> SIGFPE {FPE_INTOVF, FPE_INTDIV, FPE_FLTOVF, FPE_FLTDIV, FPE_FLTUND,
>> FPE_FLTINV, FPE_FLTRES, FPE_FLTUNK} and SIGTRAP {TRAP_UNK}.
>>
>> Placing si_trapno into the union would also make the problem that the
>> union is pointer aligned a non-problem as then the union immediate
>> follows a pointer.
>>
>> I hadn't had a chance to look before but we must deal with this. The
>> definition of perf_sigtrap in 42dec9a936e7696bea1f27d3c5a0068cd9aa95fd
>> is broken on sparc, alpha, and ia64 as it bypasses the code in
>> kernel/signal.c that ensures the si_trapno or the ia64 special fields
>> are set.
>>
>> Not to mention that perf_sigtrap appears to abuse si_errno.
>
> There are a few other places in the kernel that repurpose si_errno
> similarly, e.g. arch/arm64/kernel/ptrace.c, kernel/seccomp.c -- it was
> either that or introduce another field or not have it. It is likely we
> could do without, but if there are different event types the user would
> have to sacrifice a few bits of si_perf to encode the event type, and
> I'd rather keep those bits for something else. Thus the decision fell to
> use si_errno.

arm64 only abuses si_errno in compat code for bug compatibility with
arm32.

> Given it'd be wasted space otherwise, and we define the semantics of
> whatever is stored in siginfo on the new signal, it'd be good to keep.

Except you don't completely. You are not defining a new signal. You
are extending the definition of SIGTRAP. Anything generic that
responds to all SIGTRAPs can reasonably be looking at si_errno.

Further you are already adding a field with si_perf you can just as
easily add a second field with well defined semantics for that data.

>> The code is only safe if the analysis that says we can move si_trapno
>> and perhaps the ia64 fields into the union is correct. It looks like
>> ia64 much more actively uses it's signal extension fields including for
>> SIGTRAP, so I am not at all certain the generic definition of
>> perf_sigtrap is safe on ia64.
>
> Trying to understand the requirements of si_trapno myself: safe here
> would mean that si_trapno is not required if we fire our SIGTRAP /
> TRAP_PERF.
>
> As far as I can tell that is the case -- see below.
>
>> > I suppose in theory sparc64 or alpha might start using the other
>> > fields in the future, and an application might be compiled against
>> > mismatched headers, but that is unlikely and is already broken
>> > with the current headers.
>>
>> If we localize the use of si_trapno to just a few special cases on alpha
>> and sparc I think we don't even need to worry about breaking userspace
>> on any architecture. It will complicate siginfo_layout, but it is a
>> complication that reflects reality.
>>
>> I don't have a clue how any of this affects ia64. Does perf work on
>> ia64? Does perf work on sparc, and alpha?
>>
>> If perf works on ia64 we need to take a hard look at what is going on
>> there as well.
>
> No perf on ia64, but it seems alpha and sparc have perf:
>
> $ git grep 'select.*HAVE_PERF_EVENTS$' -- arch/
> arch/alpha/Kconfig: select HAVE_PERF_EVENTS <--
> arch/arc/Kconfig: select HAVE_PERF_EVENTS
> arch/arm/Kconfig: select HAVE_PERF_EVENTS
> arch/arm64/Kconfig: select HAVE_PERF_EVENTS
> arch/csky/Kconfig: select HAVE_PERF_EVENTS
> arch/hexagon/Kconfig: select HAVE_PERF_EVENTS
> arch/mips/Kconfig: select HAVE_PERF_EVENTS
> arch/nds32/Kconfig: select HAVE_PERF_EVENTS
> arch/parisc/Kconfig: select HAVE_PERF_EVENTS
> arch/powerpc/Kconfig: select HAVE_PERF_EVENTS
> arch/riscv/Kconfig: select HAVE_PERF_EVENTS
> arch/s390/Kconfig: select HAVE_PERF_EVENTS
> arch/sh/Kconfig: select HAVE_PERF_EVENTS
> arch/sparc/Kconfig: select HAVE_PERF_EVENTS <--
> arch/x86/Kconfig: select HAVE_PERF_EVENTS
> arch/xtensa/Kconfig: select HAVE_PERF_EVENTS
>
> Now, given ia64 is not an issue, I wanted to understand the semantics of
> si_trapno. Per https://man7.org/linux/man-pages/man2/sigaction.2.html, I
> see:
>
> int si_trapno; /* Trap number that caused
> hardware-generated signal
> (unused on most architectures) */
>
> ... its intended semantics seem to suggest it would only be used by some
> architecture-specific signal like you identified above. So if the
> semantics is some code of a hardware trap/fault, then we're fine and do
> not need to set it.
>
> Also bearing in mind we define the semantics any new signal, and given
> most architectures do not have si_trapno, definitions of new generic
> signals should probably not include odd architecture specific details
> related to old architectures.
>
> From all this, my understanding now is that we can move si_trapno into
> the union, correct? What else did you have in mind?

Yes. Let's move si_trapno into the union.

That implies a few things like siginfo_layout needs to change.

The helpers in kernel/signal.c can change to not imply that
if you define __ARCH_SI_TRAPNO you must always define and
pass in si_trapno. A force_sig_trapno could be defined instead
to handle the cases that alpha and sparc use si_trapno.

It would be nice if a force_sig_perf_trap could be factored
out of perf_trap and placed in kernel/signal.c.

My experience (especially this round) is that it becomes much easier to
audit the users of siginfo if there is a dedicated function in
kernel/signal.c that is simply passed the parameters that need
to be placed in siginfo.

So I would very much like to see if I can make force_sig_info static.

Eric