Re: [PATCH v2 4/8] mm/memory-failure.c: fix race with changing page more robustly

From: Miaohe Lin
Date: Thu Feb 17 2022 - 20:53:11 EST


On 2022/2/18 9:13, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Wed, Feb 16, 2022 at 05:14:27PM +0800, Miaohe Lin wrote:
>> We only intend to deal with non-compound pages after splitting the THP
>> in memory_failure. However, the page could have changed compound pages due
>> to a race window. If this happens, we can try again and hopefully handle
>> the page in the next round. Also remove the unneeded orig_head; it is always
>> equal to hpage, so we can use hpage directly.
>>
>> Signed-off-by: Miaohe Lin <linmiaohe@xxxxxxxxxx>
>> ---
>> mm/memory-failure.c | 20 ++++++++++++--------
>> 1 file changed, 12 insertions(+), 8 deletions(-)
>>
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index 7e205d91b2d7..d66f642888be 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -1690,7 +1690,6 @@ int memory_failure(unsigned long pfn, int flags)
>> {
>> struct page *p;
>> struct page *hpage;
>> - struct page *orig_head;
>> struct dev_pagemap *pgmap;
>> int res = 0;
>> unsigned long page_flags;
>> @@ -1736,7 +1735,7 @@ int memory_failure(unsigned long pfn, int flags)
>> goto unlock_mutex;
>> }
>>
>> - orig_head = hpage = compound_head(p);
>> + hpage = compound_head(p);
>> num_poisoned_pages_inc();
>>
>> /*
>> @@ -1817,13 +1816,18 @@ int memory_failure(unsigned long pfn, int flags)
>> lock_page(p);
>>
>> /*
>> - * The page could have changed compound pages during the locking.
>> - * If this happens just bail out.
>> + * We're only intended to deal with the non-Compound page here.
>> + * However, the page could have changed compound pages due to
>> + * race window. If this happens, we could try again to hopefully
>> + * handle the page next round.
>> */
>> - if (PageCompound(p) && compound_head(p) != orig_head) {
>> - action_result(pfn, MF_MSG_DIFFERENT_COMPOUND, MF_IGNORED);
>> - res = -EBUSY;
>> - goto unlock_page;
>> + if (PageCompound(p)) {
>> + if (TestClearPageHWPoison(p))
>> + num_poisoned_pages_dec();
>> + unlock_page(p);
>> + put_page(p);
>> + flags &= ~MF_COUNT_INCREASED;
>
> Could you limit the retry chance only once by using the local variable
> "retry"? It might be very rare to hit the race more than once in a single
> error event, but just to be safe from potential infinite loop (that could be
> opened by future changes).
>

Sure. I will limit the retry to a single attempt with a local "retry" variable in v3. Thanks.
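
For reference, here is a rough userspace sketch of the retry-once pattern suggested above. It is not the v3 patch and not kernel code: struct fake_page, fake_page_compound() and handle_page_retry_once() are hypothetical stand-ins invented for illustration, and returning -EBUSY when the race is hit a second time is an assumption borrowed from the old bail-out path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page; NOT the kernel's struct page. */
struct fake_page {
	int compound_races;	/* how many more attempts will observe the race */
	bool hwpoison;		/* stand-in for the HWPoison flag */
};

/* Simulated PageCompound() check: reports the race while races remain. */
static bool fake_page_compound(struct fake_page *page)
{
	if (page->compound_races > 0) {
		page->compound_races--;
		return true;
	}
	return false;
}

/*
 * Retry at most once when the page turns out to be compound after locking.
 * Returns 0 on success, or -EBUSY (assumed error code) if the race is hit
 * again on the second attempt.
 */
static int handle_page_retry_once(struct fake_page *page)
{
	bool retry = true;

	assert(page != NULL);

try_again:
	page->hwpoison = true;			/* mark the page as poisoned */

	if (fake_page_compound(page)) {
		page->hwpoison = false;		/* undo before retrying or bailing out */
		if (retry) {
			retry = false;		/* only one extra round is allowed */
			goto try_again;
		}
		return -EBUSY;
	}

	return 0;
}
```

Because "retry" is cleared before the jump, the loop is bounded at two passes no matter how often the race reappears, which is the guard against a potential infinite loop.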

> Thanks,
> Naoya Horiguchi
>
>> + goto try_again;
>> }
>>
>> /*
>> --
>> 2.23.0