Re: [PATCH v3 1/4] mm: Non-pmd-mappable, large folios for folio_add_new_anon_rmap()

From: Ryan Roberts
Date: Mon Jul 17 2023 - 09:22:17 EST


On 17/07/2023 14:19, David Hildenbrand wrote:
> On 17.07.23 15:13, Ryan Roberts wrote:
>> On 17/07/2023 14:00, David Hildenbrand wrote:
>>> On 14.07.23 18:17, Ryan Roberts wrote:
>>>> In preparation for FLEXIBLE_THP support, improve
>>>> folio_add_new_anon_rmap() to allow a non-pmd-mappable, large folio to be
>>>> passed to it. In this case, all contained pages are accounted using the
>>>> order-0 folio (or base page) scheme.
>>>>
>>>> Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx>
>>>> Reviewed-by: Yu Zhao <yuzhao@xxxxxxxxxx>
>>>> Reviewed-by: Yin Fengwei <fengwei.yin@xxxxxxxxx>
>>>> ---
>>>>    mm/rmap.c | 28 +++++++++++++++++++++-------
>>>>    1 file changed, 21 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/mm/rmap.c b/mm/rmap.c
>>>> index 0c0d8857dfce..f293d072368a 100644
>>>> --- a/mm/rmap.c
>>>> +++ b/mm/rmap.c
>>>> @@ -1278,31 +1278,45 @@ void page_add_anon_rmap(struct page *page, struct
>>>> vm_area_struct *vma,
>>>>     * This means the inc-and-test can be bypassed.
>>>>     * The folio does not have to be locked.
>>>>     *
>>>> - * If the folio is large, it is accounted as a THP.  As the folio
>>>> + * If the folio is pmd-mappable, it is accounted as a THP.  As the folio
>>>>     * is new, it's assumed to be mapped exclusively by a single process.
>>>>     */
>>>>    void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct
>>>> *vma,
>>>>            unsigned long address)
>>>>    {
>>>> -    int nr;
>>>> +    int nr = folio_nr_pages(folio);
>>>>
>>>> -    VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
>>>> +    VM_BUG_ON_VMA(address < vma->vm_start ||
>>>> +            address + (nr << PAGE_SHIFT) > vma->vm_end, vma);
>>>>        __folio_set_swapbacked(folio);
>>>>
>>>> -    if (likely(!folio_test_pmd_mappable(folio))) {
>>>> +    if (!folio_test_large(folio)) {
>>>
>>> Why remove the "likely" here? The patch itself does not change anything about
>>> that condition.
>>
>> Good question; I'm not sure why. Will have to put it down to bad copy/paste
>> fixup. Will put it back in the next version.
>>
>>>
>>>>            /* increment count (starts at -1) */
>>>>            atomic_set(&folio->_mapcount, 0);
>>>> -        nr = 1;
>>>> +        __page_set_anon_rmap(folio, &folio->page, vma, address, 1);
>>>> +    } else if (!folio_test_pmd_mappable(folio)) {
>>>> +        int i;
>>>> +
>>>> +        for (i = 0; i < nr; i++) {
>>>> +            struct page *page = folio_page(folio, i);
>>>> +
>>>> +            /* increment count (starts at -1) */
>>>> +            atomic_set(&page->_mapcount, 0);
>>>> +            __page_set_anon_rmap(folio, page, vma,
>>>> +                    address + (i << PAGE_SHIFT), 1);
>>>> +        }
>>>> +
>>>> +        /* increment count (starts at 0) */
>>>
>>> That comment is a bit misleading. We're not talking about a mapcount as in the
>>> other cases here.
>>
>> Correct, I'm talking about _nr_pages_mapped, which starts 0, not -1 like
>> _mapcount. The comment was intended to be in the style used in other similar
>> places in rmap.c. I could change it to: "_nr_pages_mapped is 0-based, so set it
>> to the number of pages in the folio" or remove it entirely? What do you prefer?
>>
>
> We only have to comment what's weird, not what's normal.
>
> IOW, we also didn't have such a comment in the existing code when doing
> atomic_set(&folio->_nr_pages_mapped, COMPOUND_MAPPED);
>
>
> What might make sense here is a simple
>
> "All pages of the folio are PTE-mapped."
>

ACK - thanks.