Re: [PATCH v2] mm/migrate: put dest folio on deferred split list if source was there.

From: Zi Yan
Date: Tue Mar 12 2024 - 10:14:41 EST


On 11 Mar 2024, at 23:45, Matthew Wilcox wrote:

> On Mon, Mar 11, 2024 at 03:58:48PM -0400, Zi Yan wrote:
>> @@ -1168,6 +1172,17 @@ static int migrate_folio_unmap(new_folio_t get_new_folio,
>> folio_lock(src);
>> }
>> locked = true;
>> + if (folio_test_large_rmappable(src) &&
>> + !list_empty(&src->_deferred_list)) {
>> + struct deferred_split *ds_queue = get_deferred_split_queue(src);
>> +
>> + spin_lock(&ds_queue->split_queue_lock);
>> + ds_queue->split_queue_len--;
>> + list_del_init(&src->_deferred_list);
>> + spin_unlock(&ds_queue->split_queue_lock);
>> + old_page_state |= PAGE_WAS_ON_DEFERRED_LIST;
>> + }
>
> I have a few problems with this ...
>
> Trivial: your whitespace is utterly broken. You can't use a single tab
> for both indicating control flow change and for line-too-long.

Got it. Will not do this any more.

>
> Slightly more important: You're checking list_empty outside the lock
> (which is fine in order to avoid unnecessarily acquiring the lock),
> but you need to re-check it inside the lock in case of a race. And you
> didn't mark it as data_race(), so KCSAN will whinge.

Got it and will check data_race() related changes. Will fix.
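
For reference, the check / lock / re-check pattern asked for above can be
sketched as a small userspace model. This is only an illustration under
stated assumptions: struct model_queue, queue_add() and
queue_del_if_queued() are hypothetical names, and the pthread mutex stands
in for split_queue_lock. It is not the kernel's struct deferred_split, and
in the kernel the unlocked peek would additionally be wrapped in
data_race().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_node {
	struct list_node *prev, *next;
};

/* Hypothetical model of a deferred split queue; not the kernel's struct. */
struct model_queue {
	pthread_mutex_t lock;
	struct list_node head;
	size_t len;
};

/* A node that points at itself is not on any list. */
static void node_init(struct list_node *n) { n->prev = n->next = n; }
static bool node_unlinked(const struct list_node *n) { return n->next == n; }

static void queue_init(struct model_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	node_init(&q->head);
	q->len = 0;
}

static void queue_add(struct model_queue *q, struct list_node *n)
{
	pthread_mutex_lock(&q->lock);
	n->next = q->head.next;
	n->prev = &q->head;
	q->head.next->prev = n;
	q->head.next = n;
	q->len++;
	pthread_mutex_unlock(&q->lock);
}

/* Returns true only if this caller took the node off the queue. */
static bool queue_del_if_queued(struct model_queue *q, struct list_node *n)
{
	bool removed = false;

	/* Unlocked peek, only to avoid taking the lock when not queued. */
	if (node_unlinked(n))
		return false;

	pthread_mutex_lock(&q->lock);
	/* Re-check: another path may have dequeued it before we got the lock. */
	if (!node_unlinked(n)) {
		assert(q->len > 0);
		n->prev->next = n->next;
		n->next->prev = n->prev;
		node_init(n);
		q->len--;
		removed = true;
	}
	pthread_mutex_unlock(&q->lock);
	return removed;
}
```

The length counter and the list are only touched after the second check, so
a racing dequeue cannot make the counter go negative.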

>
> Much more important: You're doing this with a positive refcount, which
> breaks the (undocumented) logic in deferred_split_scan() that a folio
> with a positive refcount will not be removed from the list.

What is the issue here? I thought as long as the split_queue_lock is held,
it should be OK to manipulate the list.

>
> Maximally important: We shouldn't be doing any of this! This folio is
> on the deferred split list. We shouldn't be migrating it as a single
> entity; we should be splitting it now that we're in a context where we
> can do the right thing and split it. Documentation/mm/transhuge.rst
> is clear that we don't split it straight away due to locking context.
> Splitting it on migration is clearly the right thing to do.
>
> If splitting fails, we should just fail the migration; splitting fails
> due to excess references, and if the source folio has excess references,
> then migration would fail too.

You are suggesting:
1. checking if the folio is on deferred split list or not
2. if yes, split the folio
3. if split fails, fail the migration as well.

It sounds reasonable to me. The split folios should be migrated since
the before-split folio wants to be migrated. This split is not because
a new page cannot be allocated, so the split folios should go onto the
ret_folios list instead of the split_folios list.
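
The three steps above can be sketched as a toy decision function. Everything
here is a hypothetical model for illustration (struct model_folio,
model_split(), model_prepare() and the enum values are made-up names), and
the only failure cause modelled is the excess-reference case mentioned in
the quoted text. It is not the actual migrate_pages() code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a large folio; not the kernel's struct folio. */
struct model_folio {
	bool on_deferred_list;
	int extra_refs;	/* references beyond what a split can tolerate */
};

enum model_action {
	MODEL_MIGRATE_WHOLE,	/* not on the deferred list: migrate as is */
	MODEL_MIGRATE_PIECES,	/* split succeeded: migrate the pieces */
	MODEL_FAIL,		/* split failed: fail the migration too */
};

/* Model of a split: it fails when the folio has excess references. */
static bool model_split(struct model_folio *f)
{
	if (f->extra_refs > 0)
		return false;
	f->on_deferred_list = false;
	return true;
}

static enum model_action model_prepare(struct model_folio *f)
{
	/* Step 1: is the folio on the deferred split list? */
	if (!f->on_deferred_list)
		return MODEL_MIGRATE_WHOLE;
	/* Steps 2 and 3: split it, and fail the migration if that fails. */
	if (!model_split(f))
		return MODEL_FAIL;
	return MODEL_MIGRATE_PIECES;
}
```

In this model MODEL_MIGRATE_PIECES corresponds to putting the split folios
on ret_folios rather than split_folios.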

Thank you for the comments.




--
Best Regards,
Yan, Zi
