Re: [PATCH 3/9] mm/compaction: correctly return failure with bogus compound_order in strict mode

From: Kemeng Shi
Date: Tue Aug 15 2023 - 05:29:25 EST




on 8/15/2023 4:28 PM, Baolin Wang wrote:
>
>
> On 8/5/2023 7:07 PM, Kemeng Shi wrote:
>> In strict mode, we should return 0 if there is any hole in pageblock. If
>> we successfully isolated pages at beginning at pageblock and then have a
>> bogus compound_order outside pageblock in next page. We will abort search
>> loop with blockpfn > end_pfn. Although we will limit blockpfn to end_pfn,
>> we will treat it as a successful isolation in strict mode as blockpfn is
>> not < end_pfn and return partial isolated pages. Then
>> isolate_freepages_range may success unexpectly with hole in isolated
>> range.
>
> Yes, that can be happened.
>
>> This patch also removes unnecessary limit for blockpfn to go outside
>> by buddy page introduced in fixed commit or by stride introduced after
>> fixed commit. Caller could use returned blockpfn to check if full
>> pageblock is scanned by test if blockpfn >= end and to get next pfn to
>> scan inside isolate_freepages_block on demand.
>
> IMO, I don't think removing the pageblock restriction is worth it, since it did not fix anything and will make people more confused, at least to me.
>
> That is to say, it will be surprised that the blockpfn can go outside of the pageblock after calling isolate_freepages_block() to just scan only one pageblock, and I did not see in detail if this can cause other potential problems.
>
>> Fixes: 9fcd6d2e052ee ("mm, compaction: skip compound pages by order in free scanner")
>> Signed-off-by: Kemeng Shi <shikemeng@xxxxxxxxxxxxxxx>
>> ---
>>   mm/compaction.c | 21 +++++++++++----------
>>   1 file changed, 11 insertions(+), 10 deletions(-)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index fa1b100b0d10..684f6e6cd8bc 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -631,6 +631,14 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>>                   page += (1UL << order) - 1;
>>                   nr_scanned += (1UL << order) - 1;
>>               }
>> +            /*
>> +             * There is a tiny chance that we have read bogus
>> +             * compound_order(), so be careful to not go outside
>> +             * of the pageblock.
>> +             */
>> +            if (unlikely(blockpfn >= end_pfn))
>> +                blockpfn = end_pfn - 1;
>
> So we can just add this validation to ensure that the isolate_freepages_block() can return 0 if failure is happened, which can fix your problem.
>
Thanks for feedback! Sure, I will do this in next version.
>> +
>>               goto isolate_fail;
>>           }
>>   @@ -677,17 +685,10 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>>       if (locked)
>>           spin_unlock_irqrestore(&cc->zone->lock, flags);
>>   -    /*
>> -     * There is a tiny chance that we have read bogus compound_order(),
>> -     * so be careful to not go outside of the pageblock.
>> -     */
>> -    if (unlikely(blockpfn > end_pfn))
>> -        blockpfn = end_pfn;
>> -
>>       trace_mm_compaction_isolate_freepages(*start_pfn, blockpfn,
>>                       nr_scanned, total_isolated);
>>   -    /* Record how far we have got within the block */
>> +    /* Record how far we have got */
>>       *start_pfn = blockpfn;
>>         /*
>> @@ -1443,7 +1444,7 @@ fast_isolate_around(struct compact_control *cc, unsigned long pfn)
>>       isolate_freepages_block(cc, &start_pfn, end_pfn, &cc->freepages, 1, false);
>>         /* Skip this pageblock in the future as it's full or nearly full */
>> -    if (start_pfn == end_pfn && !cc->no_set_skip_hint)
>> +    if (start_pfn >= end_pfn && !cc->no_set_skip_hint)
>>           set_pageblock_skip(page);
>>   }
>>   @@ -1712,7 +1713,7 @@ static void isolate_freepages(struct compact_control *cc)
>>                       block_end_pfn, freelist, stride, false);
>>             /* Update the skip hint if the full pageblock was scanned */
>> -        if (isolate_start_pfn == block_end_pfn)
>> +        if (isolate_start_pfn >= block_end_pfn)
>>               update_pageblock_skip(cc, page, block_start_pfn -
>>                             pageblock_nr_pages);
>>  
>