Re: [PATCH v30 2/6] fs/proc/task_mmu: Implement IOCTL to get and optionally clear info about PTEs

From: Muhammad Usama Anjum
Date: Wed Aug 16 2023 - 06:29:28 EST


On 8/16/23 2:45 PM, Michał Mirosław wrote:
> On Wed, Aug 16, 2023 at 11:59:21AM +0500, Muhammad Usama Anjum wrote:
>> The PAGEMAP_SCAN IOCTL on the pagemap file can be used to get or optionally
>> clear the info about page table entries.
> [...]
>> --- a/fs/proc/task_mmu.c
>> +++ b/fs/proc/task_mmu.c
> [...]
>> +static long do_pagemap_scan(struct mm_struct *mm, unsigned long uarg)
>> +{
> [...]
>> +	for (walk_start = p.arg.start; walk_start < p.arg.end;
>> +			walk_start = p.arg.walk_end) {
>> +		long n_out;
>> +
>> +		if (fatal_signal_pending(current)) {
>> +			ret = -EINTR;
>> +			break;
>> +		}
>> +
>> +		ret = mmap_read_lock_killable(mm);
>> +		if (ret)
>> +			break;
>> +		ret = walk_page_range(mm, walk_start, p.arg.end,
>> +				      &pagemap_scan_ops, &p);
>> +		mmap_read_unlock(mm);
>> +
>> +		n_out = pagemap_scan_flush_buffer(&p);
>> +		if (n_out < 0)
>> +			ret = n_out;
>> +		else
>> +			n_ranges_out += n_out;
>> +
>> +		p.arg.walk_end = p.walk_end_addr ? p.walk_end_addr : p.arg.end;
>
> I think p.walk_end_addr can be removed and replaced by `p.arg.walk_end`
> directly in the walk functions. If we don't set walk_end_addr we'll also
> return 0 so the check below will match. Might be good to add this as
> a comment.
I'll remove it and add a short comment.
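
To illustrate the idea, here is a minimal userspace sketch of the loop once
p.walk_end_addr is gone. It is not the kernel code: scan_model, mock_walk()
and scan_model_run() are hypothetical names, the chunk field only simulates
a full output buffer, and resetting walk_end before each walk is an
assumption of this model.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of the scan state; only the fields discussed here. */
struct scan_model {
	uint64_t start, end;
	uint64_t walk_end;	/* written directly by the walk, 0 if untouched */
	uint64_t chunk;		/* assumed: bytes one walk covers before "buffer full" */
};

/* Mock walk: stops early with -ENOSPC and records where it stopped. */
static int mock_walk(struct scan_model *p, uint64_t from)
{
	if (p->end - from > p->chunk) {
		p->walk_end = from + p->chunk;
		return -ENOSPC;
	}
	return 0;	/* finished: walk_end is left untouched */
}

/* Loop shape from the patch without walk_end_addr; returns the walk count. */
static int scan_model_run(struct scan_model *p)
{
	uint64_t walk_start;
	int ret, walks = 0;

	assert(p->start <= p->end && p->chunk > 0);
	for (walk_start = p->start; walk_start < p->end;
			walk_start = p->walk_end) {
		p->walk_end = 0;
		ret = mock_walk(p, walk_start);
		walks++;
		/* An untouched walk_end means the walk reached the end. */
		if (!p->walk_end)
			p->walk_end = p->end;
		if (ret != -ENOSPC)
			break;
	}
	return walks;
}
```

In this model a range of 10 bytes with a chunk of 4 takes three walks and
leaves walk_end at the end of the range.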

>
>> +		if (ret != -ENOSPC)
>> +			break;
>> +
>> +		if (p.arg.vec_len == 0 || p.found_pages == p.arg.max_pages)
>> +			break;
>> +	}
>> +
>> +	/* ENOSPC signifies early stop (buffer full) from the walk. */
>> +	if (!ret || ret == -ENOSPC)
>> +		ret = n_ranges_out;
>> +
>> +	p.arg.walk_end = p.arg.walk_end ? p.arg.walk_end : walk_start;
>
> When the walk is finished, with ret == 0, the walk_start will point to
> the beginning, not the end of the range. So:
>
> if (!walk_end) walk_end = p.arg.end;
This condition handles the case where the for loop doesn't execute at all
because the address range is empty. In that case start == end, so either
p.arg.start or p.arg.end would work. I'll use p.arg.end, consistent with
the loop above.
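
As a rough userspace sketch of that fallback (not the kernel code;
scan_arg_model and final_walk_end() are hypothetical names used only for
illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant fields of the ioctl argument. */
struct scan_arg_model {
	uint64_t start;
	uint64_t end;
	uint64_t walk_end;	/* 0 means no walk has set it */
};

/*
 * Fallback applied after the loop: if no walk set walk_end (for example
 * because start == end and the loop body never ran), report the end of
 * the requested range.
 */
static uint64_t final_walk_end(const struct scan_arg_model *arg)
{
	assert(arg->start <= arg->end);
	return arg->walk_end ? arg->walk_end : arg->end;
}
```

For an empty range this returns end, which equals start, so the value
reported to userspace is the same either way.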

>
> Other than that, the patch looks complete now. Thanks for all your work!
I'll send the next revision.

>
> Best Regards
> Michał Mirosław

--
BR,
Muhammad Usama Anjum