Re: [PATCH v2] mm: prohibit the last subpage from reusing the entire large folio

From: David Hildenbrand
Date: Fri Mar 08 2024 - 04:34:34 EST


On 08.03.24 10:27, Barry Song wrote:
From: Barry Song <v-songbaohua@xxxxxxxx>

In a Copy-on-Write (CoW) scenario, the last subpage will reuse the entire
large folio, resulting in the waste of (nr_pages - 1) pages. This wasted
memory remains allocated until it is either unmapped or memory
reclamation occurs.

The following small program can serve as evidence of this behavior:

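#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SIZE (1024 * 1024 * 1024UL)

int main(void)
{
	void *p = malloc(SIZE);

	memset(p, 0x11, SIZE);	/* fault in 1GiB of anonymous memory */
	if (fork() == 0)
		_exit(0);	/* child exits right away */
	memset(p, 0x12, SIZE);	/* parent rewrites the range, taking CoW faults */
	printf("done\n");
	while (1)
		;		/* stay alive so "free -m" can be sampled */
}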

For example, with 1024KiB mTHP enabled by:
  echo always > /sys/kernel/mm/transparent_hugepage/hugepages-1024kB/enabled

(1) w/o the patch, it takes 2GiB,

Before running the test program,
/ # free -m
              total        used        free      shared  buff/cache   available
Mem:           5754          84        5692           0          17        5669
Swap:             0           0           0

/ # /a.out &
/ # done

After running the test program,
/ # free -m
              total        used        free      shared  buff/cache   available
Mem:           5754        2149        3627           0          19        3605
Swap:             0           0           0

(2) w/ the patch, it takes 1GiB only,

Before running the test program,
/ # free -m
              total        used        free      shared  buff/cache   available
Mem:           5754          89        5687           0          17        5664
Swap:             0           0           0

/ # /a.out &
/ # done

After running the test program,
/ # free -m
              total        used        free      shared  buff/cache   available
Mem:           5754        1122        4655           0          17        4632
Swap:             0           0           0

This patch migrates the last subpage to a small folio and immediately
returns the large folio to the system. It benefits both memory availability
and anti-fragmentation.

It might be a controversial optimization and, as Ryan said, there are
likely other cases where we'd want to migrate off of a THP earlier, if
possible.

But I like that it just handles large folios now in a consistent way for the time being.

Acked-by: David Hildenbrand <david@xxxxxxxxxx>

--
Cheers,

David / dhildenb