Re: [RFC PATCH 1/3] hugetlb: skip to end of PT page mapping when pte not present

From: Baolin Wang
Date: Wed Jun 15 2022 - 23:48:59 EST

On 6/16/2022 5:22 AM, Mike Kravetz wrote:
On 05/30/22 18:10, Baolin Wang wrote:


On 5/28/2022 6:58 AM, Mike Kravetz wrote:
HugeTLB address ranges are linearly scanned during fork, unmap and
remap operations. If a non-present entry is encountered, the code
currently continues to the next huge-page-aligned address. However,
a non-present entry implies that the page table page for that entry
is not present. Therefore, the linear scan can skip to the end of the
range mapped by the page table page. This can speed up operations on
large, sparsely populated hugetlb mappings.

Create a new routine hugetlb_mask_last_hp() that will return an
address mask. When the mask is ORed with an address, the result
will be the address of the last huge page mapped by the associated
page table page. Use this mask to update addresses in routines which
linearly scan hugetlb address ranges when a non-present pte is
encountered.

hugetlb_mask_last_hp() is tied to the implementation of huge_pte_offset(),
as it is called when huge_pte_offset() returns NULL. This patch only
provides a complete hugetlb_mask_last_hp() implementation when
CONFIG_ARCH_WANT_GENERAL_HUGETLB is defined. Architectures which provide
their own versions of huge_pte_offset() can also provide their own
version of hugetlb_mask_last_hp().
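
To make the mask semantics concrete, here is a minimal sketch of the
scan pattern this enables; the loop shape and the names start, end, mm
and h are illustrative context for this example, not the patch's exact
code:

	unsigned long sz = huge_page_size(h);
	unsigned long last_addr_mask = hugetlb_mask_last_hp(h);
	unsigned long addr;
	pte_t *ptep;

	for (addr = start; addr < end; addr += sz) {
		ptep = huge_pte_offset(mm, addr, sz);
		if (!ptep) {
			/*
			 * No page table page here: jump to the last
			 * huge page the missing PT page would map; the
			 * loop increment then moves on to the first
			 * address covered by the next PT page.
			 */
			addr |= last_addr_mask;
			continue;
		}
		/* ... process the present entry at addr ... */
	}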

I tested on my ARM64 machine after implementing the arm64-specific
hugetlb_mask_last_hp() below, and it works well.

Just a few nits inline, otherwise looks good to me.
Tested-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
Reviewed-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index d93ba128a2b0..e04a097ffcc4 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -376,6 +376,28 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
 	return NULL;
 }
 
+unsigned long hugetlb_mask_last_hp(struct hstate *h)
+{
+	unsigned long hp_size = huge_page_size(h);
+
+	switch (hp_size) {
+	case P4D_SIZE:
+		return PGDIR_SIZE - P4D_SIZE;
+	case PUD_SIZE:
+		return P4D_SIZE - PUD_SIZE;
+	case CONT_PMD_SIZE:
+		return PUD_SIZE - CONT_PMD_SIZE;
+	case PMD_SIZE:
+		return PUD_SIZE - PMD_SIZE;
+	case CONT_PTE_SIZE:
+		return PMD_SIZE - CONT_PTE_SIZE;
+	default:
+		break;
+	}
+
+	return ~(0UL);
+}
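
To make the arm64 numbers concrete: on a 4K-granule kernel, a PMD table
page maps PUD_SIZE (1GB) worth of 2MB huge pages, so for PMD_SIZE the
mask works out as below (a worked example with assumed addresses, not
taken from the patch):

	/*
	 * mask = PUD_SIZE - PMD_SIZE
	 *      = 0x40000000 - 0x00200000 = 0x3fe00000
	 *
	 * For a non-present pte at addr = 0x40200000:
	 *   addr | mask = 0x7fe00000
	 * i.e. the last 2MB huge page mapped by the missing PMD table
	 * page; the loop increment then lands on the next PUD boundary
	 * at 0x80000000.
	 */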

Hello Baolin,

Would you mind sending this as a proper patch with a commit message and
'Signed-off-by:'? I would like to include it in the upcoming patch series.

Sure. I've sent it out [1], and please fold it into your series. Thanks.

[1] https://lore.kernel.org/all/7256dbe078d7231f45b0f47c2c52a3bd3aa10da7.1655350193.git.baolin.wang@xxxxxxxxxxxxxxxxx/