Re: mlockall(MCL_CURRENT) blocking infinitely

From: Randy Dunlap
Date: Thu Oct 24 2019 - 19:36:18 EST


[adding linux-mm + people]

I see only one change in the last 4 years:

commit dedca63504a204dc8410d98883fdc16dffa8cb80
Author: Potyra, Stefan <Stefan.Potyra@xxxxxxxxxxxxxx>
Date: Thu Jun 13 15:55:55 2019 -0700

mm/mlock.c: mlockall error for flag MCL_ONFAULT


On 10/24/19 12:36 AM, Robert Stupp wrote:
> Hi guys,
>
> I've got an issue with `mlockall(MCL_CURRENT)` after upgrading Ubuntu 19.04 to 19.10 - i.e. kernel version change from 5.0.x to 5.3.x.
>
> The following simple program hangs forever with one CPU running at 100% (kernel):
>
> #include <stdio.h>
> #include <sys/mman.h>
> int main(void) {
>   printf("Before mlockall(MCL_CURRENT)\n");
>   // works in 5.0
>   // hangs forever w/ 5.1 and newer
>   mlockall(MCL_CURRENT);
>   printf("After mlockall(MCL_CURRENT)\n");
> }
>
> All kernel versions since 5.1 (tried 5.1.0, 5.1.21, 5.2.21, 5.3.0-19, 5.3.7, 5.4-rc4) show the same symptom (hanging in mlockall(MCL_CURRENT) with 100% kernel-CPU). 5.0 kernel versions (5.0.21) are fine.
>
> At first I thought it was something generic, so I tried the above program in a fresh install of Ubuntu eoan (5.3.x) in a VirtualBox VM, but it works fine there. So I suspect it has to do with something specific to my machine.
>
> My first suspicion was that some library "hijacks" mlockall(), but calling the test program with `LD_DEBUG=all` shows that glibc gets called directly:
>      12248:    symbol=mlockall;  lookup in file=./test [0]
>      12248:    symbol=mlockall;  lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
>      12248:    binding file ./test [0] to /lib/x86_64-linux-gnu/libc.so.6 [0]: normal symbol `mlockall' [GLIBC_2.2.5]
> `strace` doesn't show anything meaningful (besides the fact that mlockall() is called but never returns). dmesg and syslog don't show anything obvious (to me) either.
>
> Some information about the machine:
> - Intel(R) Core(TM) i7-6900K, Intel X99 chipset
> - NVMe 1.1b
> - 64GB RAM (4x 16GB)
>
> I've also reverted all changes to sysctl and ld.conf and checked for other suspicious software, without any luck.
>
> I also tried a bunch of variations of the above program, but only `mlockall(MCL_CURRENT)` or `mlockall(MCL_FUTURE | MCL_CURRENT)` hang.
>
> A `git diff v5.0..v5.1 mm/` doesn't show anything obvious (to me).
>
> It seems there's no debug/trace information that would help find out what exactly it's doing.
>
> I'm kinda lost at the moment.
>
>
> PS: Variations of the above test program:
>
> #include <stdio.h>
> #include <sys/mman.h>
> char foo[65536];
> int main(void) {
>   printf("Before mlock()\n");
>   int e = mlock(foo, 8192); // works in 5.0, 5.1, 5.2, 5.3, 5.4
>   printf("After mlock()=%d\n", e);
> }
>
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/mman.h>
> int main(void) {
>   printf("Before mlockall(MCL_FUTURE)\n");
>   int e = mlockall(MCL_FUTURE); // works in 5.0, 5.1, 5.2, 5.3, 5.4
>   printf("After mlockall(MCL_FUTURE) = %d\n", e);
>   void* mem = malloc(1024 * 1024 * 1024);
>   printf("After malloc()\n");
>   mem = malloc(1024 * 1024 * 1024);
>   printf("After malloc()\n");
>   mem = malloc(1024 * 1024 * 1024);
>   printf("After malloc()\n");
>   // works in 5.0, 5.1, 5.2, 5.3, 5.4
> }
>
>
> #include <stdio.h>
> #include <sys/mman.h>
> int main(void) {
>   printf("Before munlockall()\n");
>   int e = munlockall(); // works in 5.0, 5.1, 5.2, 5.3, 5.4
>   printf("After munlockall() = %d\n", e);
> }
>
>
> #include <stdio.h>
> #include <sys/mman.h>
> int main(void) {
>   printf("Before mlockall(MCL_CURRENT|MCL_FUTURE)\n");
>   // works in 5.0
>   // hangs forever w/ 5.1 and newer
>   int e = mlockall(MCL_CURRENT|MCL_FUTURE);
>   printf("After mlockall(MCL_CURRENT|MCL_FUTURE) = %d\n", e);
> }
>
> PPS: The kernel images were installed from https://kernel.ubuntu.com/~kernel-ppa/mainline/?C=N;O=D
>


--
~Randy