Re: mlockall(MCL_CURRENT) blocking infinitely

From: Michal Hocko
Date: Fri Oct 25 2019 - 07:46:37 EST


On Fri 25-10-19 13:02:23, Robert Stupp wrote:
> On Fri, 2019-10-25 at 11:21 +0200, Michal Hocko wrote:
> > On Thu 24-10-19 16:34:46, Randy Dunlap wrote:
> > > [adding linux-mm + people]
> > >
> > > On 10/24/19 12:36 AM, Robert Stupp wrote:
> > > > Hi guys,
> > > >
> > > > I've got an issue with `mlockall(MCL_CURRENT)` after upgrading
> > > > Ubuntu 19.04 to 19.10 - i.e. kernel version change from 5.0.x to
> > > > 5.3.x.
> > > >
> > > > The following simple program hangs forever with one CPU running
> > > > at 100% (kernel):
> >
> > Can you capture several snapshots of /proc/$(pidof $YOURTASK)/stack
> > while this is happening?
>
> Sure,
>
> Approach:
> - one shell running
>     while true; do cat /proc/$(pidof test)/stack; done
> - starting ./test in another shell and pressing ctrl-c, repeated quite
>   a few times
>
> Vast majority of all ./test invocations return an empty 'stack' file.
> Some tries, maybe 1 out of 20, returned these snapshots.
> Was running 5.3.7 for this test.
>
>
> [<0>] __handle_mm_fault+0x4c5/0x7a0
> [<0>] handle_mm_fault+0xca/0x1f0
> [<0>] __get_user_pages+0x230/0x770
> [<0>] populate_vma_page_range+0x74/0x80
> [<0>] __mm_populate+0xb1/0x150
> [<0>] __x64_sys_mlockall+0x11c/0x190
> [<0>] do_syscall_64+0x5a/0x130
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [<0>] __handle_mm_fault+0x4c5/0x7a0
> [<0>] handle_mm_fault+0xca/0x1f0
> [<0>] __get_user_pages+0x230/0x770
> [<0>] populate_vma_page_range+0x74/0x80
> [<0>] __mm_populate+0xb1/0x150
> [<0>] __x64_sys_mlockall+0x11c/0x190
> [<0>] do_syscall_64+0x5a/0x130
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
>
> [<0>] __handle_mm_fault+0x4c5/0x7a0
> [<0>] handle_mm_fault+0xca/0x1f0
> [<0>] __get_user_pages+0x230/0x770
> [<0>] populate_vma_page_range+0x74/0x80
> [<0>] __mm_populate+0xb1/0x150
> [<0>] __x64_sys_mlockall+0x11c/0x190
> [<0>] do_syscall_64+0x5a/0x130
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
>
> [<0>] __do_fault+0x3c/0x130
> [<0>] do_fault+0x248/0x640
> [<0>] __handle_mm_fault+0x4c5/0x7a0
> [<0>] handle_mm_fault+0xca/0x1f0
> [<0>] __get_user_pages+0x230/0x770
> [<0>] populate_vma_page_range+0x74/0x80
> [<0>] __mm_populate+0xb1/0x150
> [<0>] __x64_sys_mlockall+0x11c/0x190
> [<0>] do_syscall_64+0x5a/0x130
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>

This is expected.

> // doubt this one is relevant
> [<0>] __wake_up_common_lock+0x7c/0xc0
> [<0>] __wake_up_sync_key+0x1e/0x30
> [<0>] __wake_up_parent+0x26/0x30
> [<0>] do_notify_parent+0x1cc/0x280
> [<0>] do_exit+0x703/0xaf0
> [<0>] do_group_exit+0x47/0xb0
> [<0>] get_signal+0x165/0x880
> [<0>] do_signal+0x34/0x280
> [<0>] exit_to_usermode_loop+0xbf/0x160
> [<0>] do_syscall_64+0x10f/0x130
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Hmm, this means that the task has exited, so how come there are
other syscalls happening? Are you sure you are collecting stacks for the
correct task?
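
One way to rule that out, as a rough sketch only: pin the snapshots to a
single PID instead of re-running pidof on every iteration, and label each
snapshot with the PID and command name. The pidof-based loop may pick up
a task that is already exiting after ctrl-c. The helper name
snapshot_stack is made up for illustration, and ./test is assumed to be
the reproducer from the quoted mail. Reading /proc/<pid>/stack usually
requires root.

```shell
#!/bin/sh
# Sketch: print one labelled kernel-stack snapshot for a fixed PID.
# snapshot_stack is a hypothetical helper name, not an existing tool.
# Returns non-zero once /proc/<pid> is gone, i.e. the task has exited.
snapshot_stack() {
    pid=$1
    # /proc/<pid>/comm disappears when the task is reaped.
    [ -r "/proc/$pid/comm" ] || return 1
    printf '=== pid=%s comm=%s ===\n' "$pid" "$(cat "/proc/$pid/comm" 2>/dev/null)"
    # The stack file is normally readable by root only.
    cat "/proc/$pid/stack" 2>/dev/null || echo "(stack not readable)"
}

# Usage (assumes the reproducer binary is ./test):
#   ./test & pid=$!
#   while snapshot_stack "$pid"; do sleep 1; done
```

With the PID fixed up front, every snapshot is attributable to exactly
one task, and the loop stops by itself once that task is gone.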
--
Michal Hocko
SUSE Labs