Re: [PATCH v1 0/3] Speed up boot with faster linear map creation

From: Itaru Kitayama
Date: Wed Mar 27 2024 - 07:07:44 EST


On Tue, Mar 26, 2024 at 10:14:45AM +0000, Ryan Roberts wrote:
> Hi All,
>
> It turns out that creating the linear map can take a significant proportion of
> the total boot time, especially when rodata=full. And a large portion of the
> time it takes to create the linear map is issuing TLBIs. This series reworks the
> kernel pgtable generation code to significantly reduce the number of TLBIs. See
> each patch for details.
>
> The below shows the execution time of map_mem() across a couple of different
> systems with different RAM configurations. We measure after applying each patch
> and show the improvement relative to base (v6.9-rc1):
>
>                | Apple M2 VM | Ampere Altra| Ampere Altra| Ampere Altra
>                | VM, 16G     | VM, 64G     | VM, 256G    | Metal, 512G
> ---------------|-------------|-------------|-------------|-------------
>                | ms      (%) | ms      (%) | ms      (%) | ms      (%)
> ---------------|-------------|-------------|-------------|-------------
> base           |  151   (0%) | 2191   (0%) | 8990   (0%) | 17443  (0%)
> no-cont-remap  |   77 (-49%) |  429 (-80%) | 1753 (-80%) |  3796 (-78%)
> no-alloc-remap |   77 (-49%) |  375 (-83%) | 1532 (-83%) |  3366 (-81%)
> lazy-unmap     |   63 (-58%) |  330 (-85%) | 1312 (-85%) |  2929 (-83%)
>
> This series applies on top of v6.9-rc1. All mm selftests pass. I haven't yet
> tested all VA size configs (although I don't anticipate any issues); I'll do
> this as part of followup.

The series applied cleanly on top of v6.9-rc1+ (Linus's master branch)
and boots fine on an M1 VM with 14GB of memory.

Just out of curiosity, how did you measure the boot time and obtain the
breakdown of the execution times of each phase?

Tested-by: Itaru Kitayama <itaru.kitayama@xxxxxxxxxxx>

Thanks,
Itaru.

>
> Thanks,
> Ryan
>
>
> Ryan Roberts (3):
>   arm64: mm: Don't remap pgtables per- cont(pte|pmd) block
>   arm64: mm: Don't remap pgtables for allocate vs populate
>   arm64: mm: Lazily clear pte table mappings from fixmap
>
>  arch/arm64/include/asm/fixmap.h  |   5 +-
>  arch/arm64/include/asm/mmu.h     |   8 +
>  arch/arm64/include/asm/pgtable.h |   4 -
>  arch/arm64/kernel/cpufeature.c   |  10 +-
>  arch/arm64/mm/fixmap.c           |  11 +
>  arch/arm64/mm/mmu.c              | 364 +++++++++++++++++++++++--------
>  include/linux/pgtable.h          |   8 +
>  7 files changed, 307 insertions(+), 103 deletions(-)
>
> --
> 2.25.1
>