Strange EFAULT on mips64el returned by syscall when another thread is forking

From: Xi Ruoyao
Date: Wed Jan 24 2024 - 05:43:43 EST


Hi,

While testing the Glibc master branch for the upcoming 2.39 release, I
noticed an alarming test failure on mips64el:

FAIL: stdlib/tst-arc4random-thread

I've gathered some info about it and pasted my findings into
https://sourceware.org/glibc/wiki/Testing/Tests/stdlib/tst-arc4random-thread.

Eventually I was able to reduce the test case to:

#include <stdlib.h>
#include <errno.h>
#include <pthread.h>
#include <unistd.h>
#include <fcntl.h>

void *
test_thread (void *)
{
  char buf[16] = {};
  int fd = open ("/dev/zero", O_RDONLY);
  while (1)
    {
      ssize_t ret = read (fd, buf, 7);
      if (ret == -1 && errno == EFAULT)
        abort ();
    }
}

void *
fork_thread (void *)
{
  while (1)
    {
      if (!fork ())
        _exit (0);
    }
}

int
main (void)
{
  pthread_t test_th;
  pthread_t fork_th;

  pthread_create (&test_th, NULL, test_thread, NULL);
  pthread_create (&fork_th, NULL, fork_thread, NULL);
  pthread_join (test_th, NULL);
  pthread_join (fork_th, NULL);
}

When running this on the mainline kernel (revision
6.8.0-rc1+-g7ed2632ec7d72e926b9e8bcc9ad1bb0cd37274bf) it fails within
milliseconds. Some "interesting" aspects:

1. This is related to the size parameter passed to read (). When it's
less than 8 the test fails, but when it's 8 or greater there is no failure.
2. This is not related to whether "buf" is initialized.
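
To probe observation 1 more systematically, the reproducer could be turned
into a bounded helper that takes the read size as a parameter. The following
is only a sketch under assumptions: count_efaults, struct probe, the
iteration bound, the stop flag and the waitpid () reaping are all
hypothetical additions for illustration, not part of the original test. In
particular, reaping the children may change the timing enough to make the
failure harder to hit.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one probe run (not in the original).  */
struct probe
{
  size_t size;      /* byte count passed to read ()  */
  long iterations;  /* number of read () calls to issue  */
  long efaults;     /* EFAULT results seen, or -1 if open () failed  */
};

/* Set by the reader when it is done so the fork loop can stop.  */
static atomic_bool stop;

static void *
read_thread (void *arg)
{
  struct probe *p = arg;
  char buf[16] = { 0 };
  int fd = open ("/dev/zero", O_RDONLY);

  if (fd < 0)
    p->efaults = -1;
  else
    {
      for (long i = 0; i < p->iterations; i++)
        if (read (fd, buf, p->size) == -1 && errno == EFAULT)
          p->efaults++;
      close (fd);
    }
  atomic_store (&stop, true);
  return NULL;
}

static void *
fork_thread (void *arg)
{
  (void) arg;
  while (!atomic_load (&stop))
    {
      pid_t pid = fork ();
      if (pid == 0)
        _exit (0);
      /* Assumption: reap each child so zombies do not pile up.  */
      if (pid > 0)
        waitpid (pid, NULL, 0);
    }
  return NULL;
}

/* Count EFAULT results from ITERATIONS reads of SIZE bytes issued
   while another thread forks in a loop.  Returns -1 if /dev/zero
   could not be opened.  */
long
count_efaults (size_t size, long iterations)
{
  struct probe p = { size, iterations, 0 };
  pthread_t reader, forker;

  assert (size <= 16);  /* must fit in the reader's buffer  */
  atomic_store (&stop, false);
  pthread_create (&reader, NULL, read_thread, &p);
  pthread_create (&forker, NULL, fork_thread, NULL);
  pthread_join (reader, NULL);
  pthread_join (forker, NULL);
  return p.efaults;
}
```

Calling count_efaults (7, N) and count_efaults (8, N) for the same N from a
small main () would then compare the two sides of the size threshold
directly. A non-zero result for 7 and zero for 8 would match the behaviour
described above.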

I now suspect this might be a kernel bug. Any pointers for further
triage?

--
Xi Ruoyao <xry111@xxxxxxxxxxx>
School of Aerospace Science and Technology, Xidian University