Re: [PATCH v30 2/6] fs/proc/task_mmu: Implement IOCTL to get and optionally clear info about PTEs

From: Michał Mirosław
Date: Wed Aug 16 2023 - 05:46:50 EST


On Wed, Aug 16, 2023 at 11:59:21AM +0500, Muhammad Usama Anjum wrote:
> The PAGEMAP_SCAN IOCTL on the pagemap file can be used to get or optionally
> clear the info about page table entries.
[...]
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
[...]
> +static long do_pagemap_scan(struct mm_struct *mm, unsigned long uarg)
> +{
[...]
> +	for (walk_start = p.arg.start; walk_start < p.arg.end;
> +			walk_start = p.arg.walk_end) {
> +		long n_out;
> +
> +		if (fatal_signal_pending(current)) {
> +			ret = -EINTR;
> +			break;
> +		}
> +
> +		ret = mmap_read_lock_killable(mm);
> +		if (ret)
> +			break;
> +		ret = walk_page_range(mm, walk_start, p.arg.end,
> +				      &pagemap_scan_ops, &p);
> +		mmap_read_unlock(mm);
> +
> +		n_out = pagemap_scan_flush_buffer(&p);
> +		if (n_out < 0)
> +			ret = n_out;
> +		else
> +			n_ranges_out += n_out;
> +
> +		p.arg.walk_end = p.walk_end_addr ? p.walk_end_addr : p.arg.end;

I think p.walk_end_addr can be removed and the walk functions can set
`p.arg.walk_end` directly. If the walk never sets it, the walk also
returns 0, so the `ret != -ENOSPC` check below will break out of the
loop. It might be good to note this in a comment.
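
To illustrate the idea, here is a rough user-space model, not kernel
code. The names `struct scan_arg`, `fake_walk()` and `scan()` are made
up for this sketch, and the "budget" merely stands in for the output
buffer filling up. The fake walk stores the stop address straight into
`walk_end` and leaves it at 0 when it covers the whole range:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the start/end/walk_end fields of the ioctl argument. */
struct scan_arg {
	unsigned long start;
	unsigned long end;
	unsigned long walk_end;	/* 0 means "the walk did not stop early" */
};

/*
 * Hypothetical walk: visits one unit per step and stops with -ENOSPC
 * after 'budget' steps, recording the stop address directly in
 * arg->walk_end (there is no separate walk_end_addr).
 */
static int fake_walk(struct scan_arg *arg, unsigned long from,
		     unsigned long budget)
{
	unsigned long addr;

	arg->walk_end = 0;
	for (addr = from; addr < arg->end; addr++) {
		if (budget-- == 0) {
			arg->walk_end = addr;
			return -ENOSPC;
		}
	}
	return 0;	/* walk_end stays 0, so ret != -ENOSPC ends the loop */
}

/* Model of the outer loop; returns the number of walk calls made. */
static int scan(struct scan_arg *arg, unsigned long budget)
{
	unsigned long walk_start;
	int ret, calls = 0;

	assert(budget > 0);	/* a zero budget would never make progress */

	for (walk_start = arg->start; walk_start < arg->end;
	     walk_start = arg->walk_end) {
		ret = fake_walk(arg, walk_start, budget);
		calls++;

		/* Unset walk_end implies the walk reached the end of the range. */
		if (!arg->walk_end)
			arg->walk_end = arg->end;

		if (ret != -ENOSPC)
			break;
	}
	return calls;
}
```

With start = 10, end = 20 and a budget of 4, the model makes three walk
calls (stopping at 14 and 18) and finishes with walk_end == 20.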

> +		if (ret != -ENOSPC)
> +			break;
> +
> +		if (p.arg.vec_len == 0 || p.found_pages == p.arg.max_pages)
> +			break;
> +	}
> +
> +	/* ENOSPC signifies early stop (buffer full) from the walk. */
> +	if (!ret || ret == -ENOSPC)
> +		ret = n_ranges_out;
> +
> +	p.arg.walk_end = p.arg.walk_end ? p.arg.walk_end : walk_start;

When the walk finishes with ret == 0, walk_start still points to the
beginning of the range, not its end. So:

	if (!p.arg.walk_end)
		p.arg.walk_end = p.arg.end;
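
A tiny user-space comparison of the two fallbacks, as an untested
sketch. Both helper names are invented here, and it assumes the walk
left walk_end at 0 after covering the whole range:

```c
/* Fallback as posted in the patch: use walk_start when walk_end is unset. */
static unsigned long walk_end_as_posted(unsigned long walk_end,
					unsigned long walk_start)
{
	return walk_end ? walk_end : walk_start;
}

/* Fallback suggested in this review: use the end of the requested range. */
static unsigned long walk_end_suggested(unsigned long walk_end,
					unsigned long end)
{
	return walk_end ? walk_end : end;
}
```

For a range [0x1000, 0x5000) walked completely in one pass, walk_start
is still 0x1000, so the posted variant reports 0x1000 while the
suggested one reports 0x5000. When walk_end is already set, both return
it unchanged.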

Other than that, the patch looks complete now. Thanks for all your work!

Best Regards
Michał Mirosław