Re: [PATCH] x86/mm/ident_map: Use gbpages only where full GB page should be mapped.

From: Dave Hansen
Date: Thu Nov 02 2023 - 12:03:46 EST


On 10/31/23 12:50, Steve Wahl wrote:
> Instead of using gbpages for all memory regions, use them only when
> map creation requests include the full GB page of space; descend to
> using smaller 2M pages when only portions of a GB page are requested.
...

The logic here is sound: we obviously don't want systems rebooting
because speculation went wonky.

> diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c
> index 968d7005f4a7..b63a1ffcfe9f 100644
> --- a/arch/x86/mm/ident_map.c
> +++ b/arch/x86/mm/ident_map.c
> @@ -31,18 +31,26 @@ static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page,
> if (next > end)
> next = end;
>
> - if (info->direct_gbpages) {
> + /*
> + * if gbpages allowed, this entry not yet present, and
> + * the full gbpage range is requested (both ends are
> + * correctly aligned), create a gbpage.
> + */
> + if (info->direct_gbpages
> + && !pud_present(*pud)
> + && !(addr & ~PUD_MASK)
> + && !(next & ~PUD_MASK)) {

This is a _bit_ too subtle for my taste.

Let's say someone asks for a mapping of 2MB@5G, then later asks for 1G@5G.
The first call in here will do the 2MB mapping under a small (non-large)
PUD entry. The second call will see the new pud_present() check and
*also* skip large PUD creation.

That's a regression. It might not matter _much_, but it's a place where
the old code creates large PUDs and the new code creates small ones.
It's the kind of behavior change that at least needs to be noted in the
changelog.

Off the top of my head, I can't think of why we'd get overlapping
requests in here, though. Did you think through that case? Is it common?
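
To be concrete, the sequence I'm thinking of is something like this
(addresses made up purely for illustration; the 'info'/'pgd' setup is
omitted, and kernel_ident_mapping_init() is the entry point that ends
up in ident_pud_init()):

	/*
	 * First request, 2MB@5G: with the new checks this creates 2MB
	 * mappings under a small PUD (the old code would have rounded
	 * it up to a full 1GB page).
	 */
	kernel_ident_mapping_init(&info, pgd, 5UL << 30,
				  (5UL << 30) + (2UL << 20));

	/*
	 * Second request, 1G@5G: the PUD is now present, so the new
	 * !pud_present() check skips 1GB page creation and the rest of
	 * the gigabyte gets mapped with 2MB pages as well.
	 */
	kernel_ident_mapping_init(&info, pgd, 5UL << 30, 6UL << 30);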

> pud_t pudval;
>
> - if (pud_present(*pud))
> - continue;
> -
> - addr &= PUD_MASK;
> pudval = __pud((addr - info->offset) | info->page_flag);
> set_pud(pud, pudval);
> continue;
> }
>
> + /* if this is already a gbpage, this portion is already mapped */
> + if (pud_large(*pud))
> + continue;

I'd probably move this check up above the large PUD creation code.
It would make it more obvious that any PUD that's encountered down below
is a small PUD.

> if (pud_present(*pud)) {
> pmd = pmd_offset(pud, 0);
> ident_pmd_init(info, pmd, addr, next);

That would end up looking something like this:

	bool do_gbpages;
...

		/* Is the whole range already mapped? */
		if (pud_large(*pud))
			continue;

		/* PUD is either empty or small */

		/* GB pages allowed? */
		do_gbpages = info->direct_gbpages;
		/* Addresses aligned for a GB page? */
		do_gbpages &= !(addr & ~PUD_MASK);
		do_gbpages &= !(next & ~PUD_MASK);
		/* Check for an existing mapping using a small PUD */
		do_gbpages &= !pud_present(*pud);

		if (do_gbpages) {
			set_pud(pud, __pud((addr - info->offset) |
					   info->page_flag));
			continue;
		}
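
Everything below that point then only ever sees an empty or small PUD,
so the rest of the loop can stay roughly what it is today:

		/* Map the range with 2MB pages, reusing any existing PMD table. */
		if (pud_present(*pud)) {
			pmd = pmd_offset(pud, 0);
			ident_pmd_init(info, pmd, addr, next);
			continue;
		}

		pmd = (pmd_t *)info->alloc_pgt_page(info->context);
		if (!pmd)
			return -ENOMEM;
		ident_pmd_init(info, pmd, addr, next);
		set_pud(pud, __pud(__pa(pmd) | info->kernpg_flag));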