Re: [RFC PATCH] mm: support large folio numa balancing

From: Baolin Wang
Date: Tue Nov 14 2023 - 06:11:51 EST

Next message: Stanley Chang[昌育德]: "RE: [PATCH v2] usb: dwc3: add missing of_node_put and platform_device_put"
Previous message: Yin, Fengwei: "Re: [Question]: major faults are still triggered after mlockall when numa balancing"
In reply to: Huang, Ying: "Re: [RFC PATCH] mm: support large folio numa balancing"
Next in thread: Huang, Ying: "Re: [RFC PATCH] mm: support large folio numa balancing"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 11/14/2023 9:12 AM, Huang, Ying wrote:

David Hildenbrand <david@xxxxxxxxxx> writes:

On 13.11.23 11:45, Baolin Wang wrote:

Currently, file pages already support large folios, and support for
anonymous pages is also under discussion[1]. Moreover, the NUMA balancing
code has been converted to use a folio by a previous thread[2], and the
migrate_pages function already supports large folio migration.
So I do not see any reason to continue restricting NUMA balancing for
large folios.

I recall John wanted to look into that. CCing him.

I'll note that the "head page mapcount" heuristic to detect sharers will
now strike on the PTE path and make us believe that a large folio is
exclusive, although it isn't.
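
To illustrate the concern above, here is a minimal userspace sketch, not kernel code. The `toy_folio` struct, `TOY_NR_PAGES`, and both helper functions are hypothetical names invented for this example; they only model the idea that a PTE-mapped large folio can have a different mapcount on each of its pages, so a check that looks only at the head page can miss a sharer of a tail page.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical size: a large folio made of four base pages. */
#define TOY_NR_PAGES 4

/* Toy stand-in for a large folio; NOT the kernel's struct folio. */
struct toy_folio {
	/* Number of PTEs mapping each page; index 0 is the head page. */
	int mapcount[TOY_NR_PAGES];
};

/* Heuristic under discussion: consult only the head page's mapcount. */
bool toy_head_says_shared(const struct toy_folio *f)
{
	return f->mapcount[0] > 1;
}

/* Stricter check: report sharing if any page is mapped more than once. */
bool toy_any_page_shared(const struct toy_folio *f)
{
	for (size_t i = 0; i < TOY_NR_PAGES; i++) {
		if (f->mapcount[i] > 1)
			return true;
	}
	return false;
}
```

With mapcounts of {1, 2, 1, 1}, the head-only check reports the folio as exclusive while the per-page scan reports it as shared. Even the per-page scan is only an approximation in this toy model: two processes that each map a disjoint part of the folio would leave every mapcount at 1.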

Even a 4k folio may be shared by multiple processes/threads. So, NUMA
balancing uses a multi-stage node selection algorithm (mostly
implemented in should_numa_migrate_memory()) to identify shared folios.
I think that the algorithm needs to be adjusted to handle shared
PTE-mapped large folios.

Not sure I get you here. should_numa_migrate_memory() uses the last CPU id, the last PID and the group NUMA faults to determine whether a page can be migrated to the target node. So for a large folio, a precise folio sharers check can make the NUMA faults of a group more accurate. Is that not enough for should_numa_migrate_memory() to make a decision?

Could you provide a more detailed description of the algorithm you would like to change for large folio? Thanks.
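
For readers following the thread, the three inputs mentioned above (last CPU id, last PID, group NUMA faults) can be pictured with the simplified userspace sketch below. It is not the kernel's should_numa_migrate_memory(): the `toy_fault` struct, its fields, the ordering of the checks and the final comparison are all assumptions made for illustration only.

```c
#include <stdbool.h>

/* Hypothetical summary of one NUMA hinting fault; not a kernel structure. */
struct toy_fault {
	int last_nid;  /* node of the CPU that faulted last time, -1 if none */
	int last_pid;  /* task that faulted last time, -1 if none */
	int this_pid;  /* task faulting now */
	int dst_nid;   /* node of the CPU faulting now (migration target) */
	unsigned long group_faults_src; /* group's faults on the current node */
	unsigned long group_faults_dst; /* group's faults on the target node */
};

/* Toy multi-stage decision: each stage can settle the answer early. */
bool toy_should_migrate(const struct toy_fault *f)
{
	/* Stage 1: no recorded history, so treat it as a first touch. */
	if (f->last_pid == -1)
		return true;

	/* Stage 2: require two consecutive faults from the same node. */
	if (f->last_nid != f->dst_nid)
		return false;

	/* Stage 3: same task twice in a row looks like private access. */
	if (f->last_pid == f->this_pid)
		return true;

	/* Stage 4: shared access, so follow where the group faults more. */
	return f->group_faults_dst > f->group_faults_src;
}
```

In this toy model the sharer information only feeds the last stage, through the group fault counters, which is where a more precise sharers check for large folios would change the outcome.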

And, as a performance improvement patch, some performance data needs to

Do you have any benchmark recommendations? I know the autonuma cannot support large folios now.

be provided. And, the effect of shared folio detection needs to be
tested too.
