[PATCH v3 00/15] Reduce preallocations for maple tree

From: Liam R. Howlett
Date: Mon Jul 24 2023 - 14:32:38 EST


Initial work on preallocations showed no regression in performance
during testing, but recently some users (both on [1] and off [android]
list) have reported that preallocating the worst-case number of nodes
has caused some slowdown. This patch set reduces the number of
allocations in a few ways.

Most munmap() operations remove a single VMA, so leverage the fact
that the maple tree can place a single pointer at range 0 - 0 without
allocating. This is done by indexing the VMAs to be removed by their
count, starting at 0.
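
As a rough illustration of why indexing from 0 helps, here is a
userspace toy model. It is not the maple tree code: struct toy_tree,
toy_store_next(), toy_load() and toy_destroy() are hypothetical names
invented for this sketch. It only assumes what is stated above, that a
lone entry at index 0 can live in the tree itself, so the common
single-VMA case never allocates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in (NOT the kernel's maple tree): one entry fits inline. */
struct toy_tree {
	void *single;		/* entry at index 0, stored without allocating */
	void **slots;		/* heap storage, used once a 2nd entry arrives */
	size_t capacity;	/* number of slots allocated (0 = inline only) */
	size_t count;		/* entries stored; also the next index to use */
	unsigned int allocations; /* how many times the heap was touched */
};

/* Store @entry at index == count. Returns 0 on success, -1 on OOM. */
static int toy_store_next(struct toy_tree *t, void *entry)
{
	assert(t != NULL);

	if (t->count == 0) {
		/* Index 0: keep the pointer inline, no allocation. */
		t->single = entry;
		t->count = 1;
		return 0;
	}

	if (t->count >= t->capacity) {
		size_t new_cap = t->capacity ? t->capacity * 2 : 4;
		void **grown = realloc(t->slots, new_cap * sizeof(*grown));

		if (!grown)
			return -1;
		if (t->capacity == 0)
			grown[0] = t->single;	/* migrate the inline entry */
		t->slots = grown;
		t->capacity = new_cap;
		t->allocations++;
	}

	t->slots[t->count++] = entry;
	return 0;
}

/* Return the entry at @index, or NULL if nothing is stored there. */
static void *toy_load(const struct toy_tree *t, size_t index)
{
	assert(t != NULL);

	if (index >= t->count)
		return NULL;
	return t->capacity ? t->slots[index] : t->single;
}

/* Release heap storage, if any was ever needed. */
static void toy_destroy(struct toy_tree *t)
{
	assert(t != NULL);

	free(t->slots);
	t->slots = NULL;
	t->capacity = 0;
	t->count = 0;
}
```

In this model, storing one entry leaves the allocations counter at 0,
and only a second entry forces a trip to the allocator. Had the first
entry been stored at some nonzero index (an address, for example), the
inline shortcut would not apply.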

Re-introduce the entry argument to mas_preallocate() so that a more
intelligent guess of the node count can be made.

Implement the more intelligent guess of the node count, although there
is more work to be done.

During development of v2 of this patch set, I also noticed that the
number of nodes being allocated for a rebalance was beyond what could
possibly be needed. This is addressed in patch 0008.

Patches are in the following order:
0001-0002: Testing framework for benchmarking some operations
0003: Reduction of maple node allocation in the side tree
0004: Small cleanup of do_vmi_align_munmap()
0005-0014: mas_preallocate() calculation change
0015: Change the vma iterator order

Changes since v2:
- No longer moving the unmap_vmas() definition - Thanks Mike Kravetz
- Rebase on top of stack changes in do_vmi_align_munmap()

v2: https://lore.kernel.org/linux-mm/20230612203953.2093911-1-Liam.Howlett@xxxxxxxxxx/
v1: https://lore.kernel.org/lkml/20230601021605.2823123-1-Liam.Howlett@xxxxxxxxxx/

Liam R. Howlett (15):
  maple_tree: Add benchmarking for mas_for_each
  maple_tree: Add benchmarking for mas_prev()
  mm: Change do_vmi_align_munmap() tracking of VMAs to remove
  mm: Remove prev check from do_vmi_align_munmap()
  maple_tree: Introduce __mas_set_range()
  mm: Remove re-walk from mmap_region()
  maple_tree: Re-introduce entry to mas_preallocate() arguments
  maple_tree: Adjust node allocation on mas_rebalance()
  mm: Use vma_iter_clear_gfp() in nommu
  mm: Set up vma iterator for vma_iter_prealloc() calls
  maple_tree: Move mas_wr_end_piv() below mas_wr_extend_null()
  maple_tree: Update mas_preallocate() testing
  maple_tree: Refine mas_preallocate() node calculations
  maple_tree: Reduce resets during store setup
  mm/mmap: Change vma iteration order in do_vmi_align_munmap()

 fs/exec.c                        |   1 +
 include/linux/maple_tree.h       |  23 ++++-
 include/linux/mm.h               |   4 +-
 lib/maple_tree.c                 | 117 +++++++++++++++------
 lib/test_maple_tree.c            |  76 ++++++++++++++
 mm/internal.h                    |  36 +++++--
 mm/memory.c                      |  16 ++-
 mm/mmap.c                        | 170 +++++++++++++++++--------------
 mm/nommu.c                       |  45 ++++----
 tools/testing/radix-tree/maple.c |  59 ++++++-----
 10 files changed, 359 insertions(+), 188 deletions(-)

--
2.39.2