Re: [PATCH V2,2/2] mm: madvise: skip unmapped vma holes passed to process_madvise

From: Michal Hocko
Date: Mon Mar 21 2022 - 11:34:15 EST


On Fri 11-03-22 20:59:06, Charan Teja Kalla wrote:
> The process_madvise() system call is expected to skip holes in vma
> passed through 'struct iovec' vector list.

Where is this assumption coming from? From the man page I can see:
: The advice might be applied to only a part of iovec if one of its
: elements points to an invalid memory region in the remote
: process. No further elements will be processed beyond that
: point.

> But do_madvise, which
> process_madvise() calls for each vma, returns ENOMEM in case of unmapped
> holes, despite the VMA is processed.
> Thus process_madvise() should treat ENOMEM as expected and consider the
> VMA passed to as processed and continue processing other vma's in the
> vector list. Returning -ENOMEM to user, despite the VMA is processed,
> will be unable to figure out where to start the next madvise.

I am not sure I follow. With your previous patch and -ENOMEM from
do_madvise you get the answer you are looking for, no?
With this applied you are losing the information that some of the iters
are not mapped or have a hole. That might be useful information,
especially when operating on remote tasks, which are free to manipulate
their address spaces.
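
To illustrate the point about the caller already having what it needs,
here is a minimal userspace sketch. It assumes process_madvise() reports
the number of bytes advised when it stops part-way through the vector,
as the man page describes. The names find_resume_point() and
struct resume_point are hypothetical and not part of any kernel or libc
API; the helper only does the arithmetic of mapping a byte count back to
a vector element.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper type: where a caller would resume after a partial run. */
struct resume_point {
	size_t index;	/* first iovec element not fully advised */
	size_t offset;	/* bytes of that element already advised */
};

/*
 * Map the byte count reported by process_madvise() back to a position
 * in the iovec array. Assumption: 'advised' counts bytes from the start
 * of the vector, in order, with no elements skipped.
 */
static struct resume_point find_resume_point(const struct iovec *vec,
					     size_t vlen, size_t advised)
{
	struct resume_point rp = { 0, 0 };

	assert(vec != NULL || vlen == 0);

	/* Consume every element that is fully covered by the byte count. */
	while (rp.index < vlen && advised >= vec[rp.index].iov_len) {
		advised -= vec[rp.index].iov_len;
		rp.index++;
	}

	/* Any remainder lies inside the element where processing stopped. */
	if (rp.index < vlen)
		rp.offset = advised;

	return rp;
}
```

If index equals vlen, the whole vector was processed. Otherwise the
caller knows which element hit the problem and can decide whether to
skip it or retry, which is the information the proposed change would
hide.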

> Fixes: ecb8ac8b1f14("mm/madvise: introduce process_madvise() syscall: an external memory hinting API")
> Cc: <stable@xxxxxxxxxxxxxxx> # 5.10+
> Signed-off-by: Charan Teja Kalla <quic_charante@xxxxxxxxxxx>
> ---
> Changes in V2:
> -- Fixed handling of ENOMEM by process_madvise().
> -- Patch doesn't exist in V1.
>
> mm/madvise.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index e97e6a9..14fb76d 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1426,9 +1426,16 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
>
>  	while (iov_iter_count(&iter)) {
>  		iovec = iov_iter_iovec(&iter);
> +		/*
> +		 * do_madvise returns ENOMEM if unmapped holes are present
> +		 * in the passed VMA. process_madvise() is expected to skip
> +		 * unmapped holes passed to it in the 'struct iovec' list
> +		 * and not fail because of them. Thus treat -ENOMEM return
> +		 * from do_madvise as valid and continue processing.
> +		 */
>  		ret = do_madvise(mm, (unsigned long)iovec.iov_base,
>  					iovec.iov_len, behavior);
> -		if (ret < 0)
> +		if (ret < 0 && ret != -ENOMEM)
>  			break;
>  		iov_iter_advance(&iter, iovec.iov_len);
>  	}
> --
> 2.7.4

--
Michal Hocko
SUSE Labs