Re: [PATCH v2] x86: Fix x32 System V message queue syscalls

From: Rich Felker
Date: Mon Jul 31 2023 - 22:53:51 EST


On Tue, Aug 01, 2023 at 02:38:47AM +0100, Jessica Clarke wrote:
> On 1 Aug 2023, at 01:43, Harald van Dijk <harald@xxxxxxxxxxx> wrote:
> >
> > On 06/12/2020 22:55, Andy Lutomirski wrote:
> >> On Sat, Dec 5, 2020 at 4:01 PM Jessica Clarke <jrtc27@xxxxxxxxxx> wrote:
> >>>
> >>> Ping?
> >> Can you submit patches implementing my proposal? One is your existing
> >> patch plus fixing struct msghdr, with Cc: stable@xxxxxxxxxxxxxxx at
> >> the bottom. The second is a removal of struct msghdr from uapi,
> >> moving it into include/linux (no uapi) if needed. The second should
> >> not cc stable.
> >
> > Hi,
> >
> > This looks like it was forgotten, but it is still needed. Jessica,
> > are you interested in submitting the requested change? If not,
> > would it be okay if I do so? I have been running this locally for
> > a long time now.
>
> Hi,
> Please feel free to; sorry that it dropped off my radar. Part of the
> issue is my laptop no longer being x86, making it more annoying to test.
>
> > There is one complication that I think has not been mentioned yet:
> > when _GNU_SOURCE is defined, glibc does provide a definition of
> > struct msghdr in <sys/msg.h> with a field "__syscall_slong_t
> > mtype;". This makes it slightly more likely that there is code out
> > there in the wild that works fine with current kernels and would
> > be broken by the fix. Given how rare x32 is, and how rare message
> > queues are, this may still be acceptable, but I am mentioning it
> > just in case this would cause a different approach to be
> > preferred. And whatever is done, a fix should also be submitted to
> > glibc.
>
> Given POSIX is very clear on how msghdr works I think we have to break
> whatever oddball code out there might be using this. The alternative is
> violating POSIX in a way that makes correct code compile fine but fail
> at run time on x32, which is a terrible place to be, especially when
> the “fix” is to special-case x32 to go against what POSIX says. I just
> can’t see how that’s a good place to stay in, even if something might
> break when we fix this bug.

Absolutely. The application-facing API absolutely needs to have the
type of mtype be whatever long is in the application-facing C ABI.
However, I'm not sure how best to fix this. A fix now still leaves
applications broken on all existing kernels in the wild. This might be
a place where libc should have x32-specific translation code to work
around the wrong kernel ABI that became the contract with the kernel.
I'm not sure how practical this is, since it seems like it would
require a temp buffer. Is the message size sufficiently bounded to
make that reasonable? Should there be a new x32-specific syscall that
takes the right ABI so that translation is only needed on old kernels?
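
To make the temp-buffer idea concrete, here is a rough sketch of what
the translation might look like. This is not code from any libc; all
names are hypothetical, and the two typedefs use fixed widths as
stand-ins for the 32-bit application-facing long and the 64-bit mtype
the kernel currently expects, so the layout logic can be tried on any
host. It assumes the message text immediately follows mtype in both
layouts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins, fixed-width so the sketch is host-independent. */
typedef int32_t app_long_t;     /* "long" in the x32 application C ABI */
typedef int64_t kernel_long_t;  /* mtype as the kernel ABI currently sees it */

/* Bytes of temporary buffer needed for a message with msgsz bytes of text. */
static size_t kernel_msg_size(size_t msgsz)
{
    return sizeof(kernel_long_t) + msgsz;
}

/* msgsnd direction: repack the application's buffer into kernel layout. */
static void app_to_kernel(const void *app_msg, size_t msgsz, void *tmp)
{
    app_long_t mtype;
    kernel_long_t kmtype;

    assert(app_msg != NULL && tmp != NULL);
    memcpy(&mtype, app_msg, sizeof mtype);
    kmtype = mtype;  /* widen 32-bit mtype to 64 bits */
    memcpy(tmp, &kmtype, sizeof kmtype);
    memcpy((char *)tmp + sizeof kmtype,
           (const char *)app_msg + sizeof mtype, msgsz);
}

/* msgrcv direction: unpack kernel layout into the application's buffer.
 * msgsz is the number of text bytes actually received. */
static void kernel_to_app(const void *tmp, size_t msgsz, void *app_msg)
{
    kernel_long_t kmtype;
    app_long_t mtype;

    assert(tmp != NULL && app_msg != NULL);
    memcpy(&kmtype, tmp, sizeof kmtype);
    mtype = (app_long_t)kmtype;  /* narrow back to the application's long */
    memcpy(app_msg, &mtype, sizeof mtype);
    memcpy((char *)app_msg + sizeof mtype,
           (const char *)tmp + sizeof kmtype, msgsz);
}
```

A wrapper would allocate kernel_msg_size(msgsz) bytes, call
app_to_kernel() before the send syscall and kernel_to_app() after the
receive syscall. Where that buffer comes from is exactly the open
question above.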

Rich