Re: [Question]: major faults are still triggered after mlockall when numa balancing

From: Yin, Fengwei
Date: Thu Nov 09 2023 - 10:15:56 EST




On 11/9/2023 10:29 PM, Matthew Wilcox wrote:
> On Thu, Nov 09, 2023 at 03:11:41PM +0100, Peter Zijlstra wrote:
>> On Thu, Nov 09, 2023 at 09:47:24PM +0800, zhangpeng (AS) wrote:
>>> Is there any way to avoid such a major fault?
>>
>> man madvise
>
> but from the mlockall manpage:
>
> mlockall() locks all pages mapped into the address space of the calling
> process. This includes the pages of the code, data, and stack segment,
> as well as shared libraries, user space kernel data, shared memory, and
> memory-mapped files. All mapped pages are guaranteed to be resident in
> RAM when the call returns successfully; the pages are guaranteed to
> stay in RAM until later unlocked.
>
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/mlockall.html
> isn't quite so explicit, but I do think that page cache should be locked
> into memory.

Here is my understanding. This is related to writes to an mlocked private
file mapping. From Peng:
"For the data segment, the global variable area is a private mapping".
So it is the data segment of the ELF file, mapped privately by the ELF loader.

In this case, even if the ELF loader is updated to mlock the data segment, a
write will trigger COW, and a new anonymous page will be allocated and
mlocked. The original file-mapped page will be munlocked in
    do_wp_page()
      wp_page_copy()
        if (old_folio) {
          page_remove_rmap()
        }
So it's possible that the original file-mapped page is reclaimed, and a
later access will trigger a major fault.
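
As a rough user-space illustration of the private-mapping behaviour
described above (not the kernel path itself), here is a minimal sketch.
The helper name private_write_is_cow() and the temporary file path are
made up for the example. It only shows that a write through an mlocked
MAP_PRIVATE file mapping lands in a private copy and leaves the file
contents alone; it does not observe the munlock of the old page or any
later reclaim.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from the thread.
 * Returns 1 if a write through an mlocked MAP_PRIVATE file mapping left
 * the underlying file untouched, 0 if the file changed, -1 on setup error.
 */
int private_write_is_cow(void)
{
    char path[] = "/tmp/cow_demo_XXXXXX";   /* assumed scratch location */
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char on_disk = 0;
    int result = -1;
    char *fill, *map;
    int fd;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                            /* file lives until close() */

    /* Fill one page of the file with 'A'. */
    fill = malloc(page);
    if (!fill) {
        close(fd);
        return -1;
    }
    memset(fill, 'A', page);
    if (write(fd, fill, page) != (ssize_t)page) {
        free(fill);
        close(fd);
        return -1;
    }
    free(fill);

    /* Private, writable file mapping, similar in kind to an ELF data segment. */
    map = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* mlock may fail under a small RLIMIT_MEMLOCK; the check below still holds. */
    if (mlock(map, page) != 0)
        perror("mlock (continuing without it)");

    /*
     * Write through the private mapping. Whether the kernel makes the
     * private copy at mlock() time or at this write is not distinguished here.
     */
    map[0] = 'B';

    /* The file itself must still hold 'A'; only the private copy holds 'B'. */
    if (pread(fd, &on_disk, 1, 0) == 1)
        result = (on_disk == 'A' && map[0] == 'B');

    munmap(map, page);
    close(fd);
    return result;
}
```

Calling private_write_is_cow() from a small main() and checking for a
return value of 1 is enough to confirm that the mapping and the file have
diverged after the write.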


Regards
Yin, Fengwei