[PATCH v1 2/2] hugetlb: process multiple lists in gather_bootmem_prealloc_parallel

From: Gang Li
Date: Tue Feb 13 2024 - 06:14:48 EST


gather_bootmem_prealloc_node currently processes only one list in the
huge_boot_pages array. gather_bootmem_prealloc therefore relies on
padata_do_multithreaded running num_node_state(N_MEMORY) instances of
gather_bootmem_prealloc_node to process all lists in huge_boot_pages.

This works with the current padata_do_multithreaded implementation,
which guarantees that size/min_chunk <= thread num <= max_threads:

```
/* Ensure at least one thread when size < min_chunk. */
nworks = max(job->size / max(job->min_chunk, job->align), 1ul);
nworks = min(nworks, job->max_threads);

ps.nworks = padata_work_alloc_mt(nworks, &ps, &works);
```

However, the comment on the padata_do_multithreaded API only promises a
maximum for the number of threads and does not specify a minimum. If
fewer threads are started, a single call to
gather_bootmem_prealloc_node may be handed a range that spans multiple
nodes, and only the first node in that range will be processed.

To avoid potential errors, introduce gather_bootmem_prealloc_parallel,
which iterates over every node in the [start, end) range it is given.
This handles the case where fewer than max_threads threads are
started.
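
The difference can be illustrated with a small user-space sketch. This
is not kernel code: run_chunks is a hypothetical, simplified stand-in
for a dispatcher that splits [0, size) into contiguous chunks, and
process_node, thread_fn_single, thread_fn_range and MAX_NODES are names
made up for this example. It only assumes that a callback may receive
a range covering more than one node.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8	/* hypothetical upper bound for this sketch */

/* Hypothetical stand-in for per-node work: record that the node was handled. */
static bool node_done[MAX_NODES];

static void process_node(unsigned long nid)
{
	node_done[nid] = true;
}

/* Old-style callback: handles only 'start' and ignores the rest of the range. */
static void thread_fn_single(unsigned long start, unsigned long end, void *arg)
{
	(void)end;
	(void)arg;
	process_node(start);
}

/* New-style callback: walks every node in [start, end). */
static void thread_fn_range(unsigned long start, unsigned long end, void *arg)
{
	unsigned long nid;

	(void)arg;
	for (nid = start; nid < end; nid++)
		process_node(nid);
}

/*
 * Simplified dispatcher (an assumption, not the padata algorithm):
 * split [0, size) into at most 'nthreads' contiguous chunks, call 'fn'
 * once per chunk, and return how many nodes were handled.
 * Requires 0 < nthreads and size <= MAX_NODES.
 */
static size_t run_chunks(unsigned long size, unsigned long nthreads,
			 void (*fn)(unsigned long, unsigned long, void *))
{
	unsigned long chunk = (size + nthreads - 1) / nthreads;
	unsigned long start, i;
	size_t done = 0;

	for (i = 0; i < MAX_NODES; i++)
		node_done[i] = false;

	for (start = 0; start < size; start += chunk) {
		unsigned long end = start + chunk < size ? start + chunk : size;

		fn(start, end, NULL);
	}

	for (i = 0; i < size; i++)
		done += node_done[i];

	return done;
}
```

With four nodes and two threads, run_chunks(4, 2, thread_fn_single)
handles only nodes 0 and 2, while run_chunks(4, 2, thread_fn_range)
handles all four. With one thread per node both callbacks behave the
same, which is why the problem does not show up today.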

Fixes: 0306f03dcbd7 ("hugetlb: parallelize 1G hugetlb initialization")
Signed-off-by: Gang Li <ligang.bdlg@xxxxxxxxxxxxx>
---
mm/hugetlb.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 25069ca6ec248..2799a7ea098c1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3414,10 +3414,8 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
* Put bootmem huge pages into the standard lists after mem_map is up.
* Note: This only applies to gigantic (order > MAX_PAGE_ORDER) pages.
*/
-static void __init gather_bootmem_prealloc_node(unsigned long start, unsigned long end, void *arg)
-
+static void __init gather_bootmem_prealloc_node(unsigned long nid)
{
- int nid = start;
LIST_HEAD(folio_list);
struct huge_bootmem_page *m;
struct hstate *h = NULL, *prev_h = NULL;
@@ -3455,10 +3453,19 @@ static void __init gather_bootmem_prealloc_node(unsigned long start, unsigned lo
prep_and_add_bootmem_folios(h, &folio_list);
}

+static void __init gather_bootmem_prealloc_parallel(unsigned long start,
+ unsigned long end, void *arg)
+{
+ int nid;
+
+ for (nid = start; nid < end; nid++)
+ gather_bootmem_prealloc_node(nid);
+}
+
static void __init gather_bootmem_prealloc(void)
{
struct padata_mt_job job = {
- .thread_fn = gather_bootmem_prealloc_node,
+ .thread_fn = gather_bootmem_prealloc_parallel,
.fn_arg = NULL,
.start = 0,
.size = num_node_state(N_MEMORY),
--
2.20.1