Re: [PATCH RFC] hugetlbfs 'noautofill' mount option

From: Dave Hansen
Date: Tue May 02 2017 - 19:43:57 EST


On 05/02/2017 04:34 PM, Prakash Sangappa wrote:
> Similarly, a madvise() option also requires an additional system call by
> every process mapping the file; this is considered an overhead for the
> database.

How long-lived are these processes? For a database, I'd assume that
this would happen a single time, or a single time per mmap() at process
startup time. Such a syscall would be doing something on the order of
taking mmap_sem, walking the VMA tree, setting a bit per VMA, and
unlocking. That's a pretty cheap one-time cost...
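
A minimal sketch of that one-time call, made once per mapping at process
startup. The helper name map_with_advice() is hypothetical, and no
MADV_NOAUTOFILL value exists today, so the advice is passed in as a
parameter. A caller can try it with MADV_DONTFORK, which exists and
likewise only sets a per-VMA flag. The mapping here is anonymous rather
than a hugetlbfs file, purely to keep the example self-contained.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map `len` bytes and apply one piece of advice to the whole range.
 * Returns the mapping, or MAP_FAILED on error.
 *
 * `advice` stands in for the proposed option (assumption: something
 * like MADV_NOAUTOFILL would be passed the same way as existing
 * MADV_* values).
 */
void *map_with_advice(size_t len, int advice)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return MAP_FAILED;

    /* The single extra syscall per mapping discussed above. */
    if (madvise(p, len, advice) != 0) {
        munmap(p, len);
        return MAP_FAILED;
    }
    return p;
}
```

Calling map_with_advice(len, MADV_DONTFORK) once per region at startup
adds exactly one madvise() per mmap(), and nothing on the fault path
afterwards.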

> If we do consider a new madvise() option, will it be acceptable
> since this will be specifically for hugetlbfs file mappings?

Ideally, it would be something that is *not* specifically for hugetlbfs.
MADV_NOAUTOFILL, for instance, could be defined to raise SIGSEGV whenever
a process touches memory that was not explicitly populated with
MADV_WILLNEED, mlock(), etc...
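
To make "populated" concrete, here is a rough sketch of the explicit
populate step using calls that exist today. It does not implement the
proposed SIGSEGV behavior. The helper names resident_pages() and
populate() are hypothetical; mincore() is used only to observe which
pages are resident, and mlock() is one of the populate mechanisms named
above. mlock() can fail under a low RLIMIT_MEMLOCK, so callers should
check its return value.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Count the pages in [addr, addr + len) that mincore() reports as
 * resident. `addr` must be page-aligned. Returns -1 on error.
 */
long resident_pages(void *addr, size_t len)
{
    size_t psz = (size_t)sysconf(_SC_PAGESIZE);
    size_t n = (len + psz - 1) / psz;
    unsigned char *vec = malloc(n);
    long count = 0;

    if (vec == NULL)
        return -1;
    if (mincore(addr, len, vec) != 0) {
        free(vec);
        return -1;
    }
    /* Only the low bit of each vector entry is defined. */
    for (size_t i = 0; i < n; i++)
        count += vec[i] & 1;
    free(vec);
    return count;
}

/*
 * Explicitly populate a range by locking it, which faults in every
 * page before mlock() returns. Returns 0 on success, -1 on error.
 */
int populate(void *addr, size_t len)
{
    return mlock(addr, len);
}
```

Under the semantics suggested above, only ranges that went through a
step like populate() would be safe to touch; anything else would fault
instead of being filled in silently.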

> If so,
> would a new flag to the mmap() call itself be acceptable, which would
> define the proposed behavior? That way no additional system calls
> need to be made.

I don't feel super strongly about it, but I guess an mmap() flag could
work too.