Re: [PATCH v18 2/5] fs/proc/task_mmu: Implement IOCTL to get and optionally clear info about PTEs

From: Muhammad Usama Anjum
Date: Thu Jun 15 2023 - 11:16:36 EST


Please review the v19. I hope to get your reviewed by tag soon.

On 6/15/23 7:58 PM, Michał Mirosław wrote:
> On Thu, 15 Jun 2023 at 16:52, Michał Mirosław <emmir@xxxxxxxxxx> wrote:
>> On Thu, 15 Jun 2023 at 15:58, Muhammad Usama Anjum
>> <usama.anjum@xxxxxxxxxxxxx> wrote:
>>> I'll send next revision now.
>>> On 6/14/23 11:00 PM, Michał Mirosław wrote:
>>>> (A quick reply to answer open questions in case they help the next version.)
>>>>
>>>> On Wed, 14 Jun 2023 at 19:10, Muhammad Usama Anjum
>>>> <usama.anjum@xxxxxxxxxxxxx> wrote:
>>>>> On 6/14/23 8:14 PM, Michał Mirosław wrote:
>>>>>> On Wed, 14 Jun 2023 at 15:46, Muhammad Usama Anjum
>>>>>> <usama.anjum@xxxxxxxxxxxxx> wrote:
>>>>>>>
>>>>>>> On 6/14/23 3:36 AM, Michał Mirosław wrote:
>>>>>>>> On Tue, 13 Jun 2023 at 12:29, Muhammad Usama Anjum
>>>>>>>> <usama.anjum@xxxxxxxxxxxxx> wrote:
>>>>>>>> For flags name: PM_REQUIRE_WRITE_ACCESS?
>>>>>>>> Or Is it intended to be checked only if doing WP (as the current name
>>>>>>>> suggests) and so it would be redundant as WP currently requires
>>>>>>>> `p->required_mask = PAGE_IS_WRITTEN`?
>>>>>>> This is intended to indicate that if userfaultfd is needed. If
>>>>>>> PAGE_IS_WRITTEN is mentioned in any of mask, we need to check if
>>>>>>> userfaultfd has been initialized for this memory. I'll rename to
>>>>>>> PM_SCAN_REQUIRE_UFFD.
>>>>>>
>>>>>> Why do we need that check? Wouldn't `is_written = false` work for vmas
>>>>>> not registered via uffd?
>>>>> UFFD_FEATURE_WP_ASYNC and UNPOPULATED needs to be set on the memory region
>>>>> for it to report correct written values on the memory region. Without UFFD
>>>>> WP ASYNC and UNPOUPULATED defined on the memory, we consider UFFD_WP state
>>>>> undefined. If user hasn't initialized memory with UFFD, he has no right to
>>>>> set is_written = false.
>>>>
>>>> How about calculating `is_written = is_uffd_registered() &&
>>>> is_uffd_wp()`? This would enable a user to apply GET+WP for the whole
>>>> address space of a process regardless of whether all of it is
>>>> registered.
>>> I wouldn't want to check if uffd is registered again and again. This is why
>>> we are doing it only once every walk in pagemap_scan_test_walk().
>>
>> There is no need to do the checks repeatedly. If I understand the code
>> correctly, uffd registration is per-vma, so it can be communicated
>> from test_walk to entry/hole callbacks via a field in
>> pagemap_scan_private.
>
> Actually... this could be exposed as a page category for the filter
> (e.g. PAGE_USES_UFFD_WP) and then you could just make the ioctl() to
> work for your usecase without tracking the ranges at the userspace
> side.
I'm not sure about page category. ASAIK the current check isn't bad when we
already mention in documentation that memory must be registered with UFFD
WP before using write feature of the IOCTL.

Just like mincore mentions in documentation that user buffer will be filled
with values based on the length of the region. Kernel doesn't care if user
had provided smaller buffer and kernel overwrites because of user's own
issue. I want to follow the same path. If user doesn't read documentation
and follow it, he should be punished with the error.

>
> Best Regards
> Michał Mirosław

--
BR,
Muhammad Usama Anjum