Re: [PATCH 1/2] mm: gup: Re-pin pages in case of trying several times to migrate

From: Baolin Wang
Date: Thu Oct 20 2022 - 05:24:25 EST




On 10/20/2022 4:15 PM, Huang, Ying wrote:
Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> writes:

The migrate_pages() will return the number of {normal page, THP, hugetlb}
that were not migrated, or an error code. That means it can still return
the number of failure count, though the pages have been migrated
successfully with several times re-try.

If my understanding were correct, if pages are migrated successfully
after several times re-tries, the return value will be 0. There's one
possibility for migrate_pages() to return non-zero but all pages are
migrated. That is, when THP is split and all subpages are migrated
successfully.

Yeah, that's the case I tested. Thanks for pointing out. I'll re-write my incorrect commit message next time.


So we should not use the return value of migrate_pages() to determin
if there are pages are failed to migrate. Instead we can validate the
'movable_page_list' to see if there are pages remained in the list,
which are failed to migrate. That can mitigate the failure of longterm
pinning.

Another choice is to use a special return value for split THP + success
migration. But I'm fine to use list_empty(return_pages).

OK. Using list_empty(return_pages) looks more simple.


Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
---
mm/gup.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 5182aba..bd8cfcd 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1914,9 +1914,10 @@ static int migrate_longterm_unpinnable_pages(
.gfp_mask = GFP_USER | __GFP_NOWARN,
};
- if (migrate_pages(movable_page_list, alloc_migration_target,
- NULL, (unsigned long)&mtc, MIGRATE_SYNC,
- MR_LONGTERM_PIN, NULL)) {
+ ret = migrate_pages(movable_page_list, alloc_migration_target,
+ NULL, (unsigned long)&mtc, MIGRATE_SYNC,
+ MR_LONGTERM_PIN, NULL);
+ if (ret < 0 || !list_empty(movable_page_list)) {

It seems that !list_empty() is sufficient here.

OK. Drop the 'ret < 0'

ret = -ENOMEM;

Why change the error code? I don't think it's a good idea to do that.

The GUP need a -errno for failure or partial success when migration, and we can not return the number of pages failed to migrate. So returning -ENOMEM seems suitable for both cases?