[PATCH v6 0/5] implement "memmap on memory" feature on s390

From: Sumanth Korikkar
Date: Mon Jan 08 2024 - 08:28:34 EST


Hi All,

This series provides "memmap on memory" support on the s390 platform.
"memmap on memory" allows the struct page array to be allocated from the
hotplugged memory range instead of from main system memory.

s390 currently preallocates the struct page array for all potentially
possible memory. This ensures that memory onlining always succeeds, but
at the cost of significant memory consumption from the available system
memory at boot time. In certain extreme configurations, this could lead
to IPL failure.

"memmap on memory" ensures that the struct page array is populated from
the self-contained hotplugged memory range instead of depleting the
available system memory. This could eliminate IPL failures on the s390
platform.
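
To put the memory cost in perspective, the size of the memory map for
one hotplugged range can be estimated as below. This is only a rough
sketch: the 4 KiB page size and the 64-byte struct page are assumptions
(common on 64-bit configurations, not taken from this series), and
memmap_pages() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): number of pages needed to hold
 * the struct page array that describes a memory range of range_bytes.
 * page_size and struct_page_size are assumptions supplied by the caller.
 */
uint64_t memmap_pages(uint64_t range_bytes, uint64_t page_size,
		      uint64_t struct_page_size)
{
	uint64_t nr_pages, memmap_bytes;

	assert(page_size != 0);

	nr_pages = range_bytes / page_size;		/* pages being described */
	memmap_bytes = nr_pages * struct_page_size;	/* size of the array */

	/* Round up to whole pages. */
	return (memmap_bytes + page_size - 1) / page_size;
}
```

With these assumed values, memmap_pages(128ULL << 20, 4096, 64) yields
512 pages (2 MiB) for a 128 MiB range, roughly 1.6% of the range. With
"memmap on memory", those pages are taken from the hotplugged range
itself rather than from system memory.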

On other platforms, the system might go OOM when physically hotplugged
memory depletes the available memory before it is onlined. Hence, the
"memmap on memory" feature was introduced, as described in commit
a08a2ae34613 ("mm,memory_hotplug: allocate memmap from the added memory
range").

Unlike on other architectures, s390 memory blocks are not physically
accessible until they are online. To make them physically accessible,
two new memory notifiers, MEM_PREPARE_ONLINE and MEM_FINISH_OFFLINE, are
added. MEM_PREPARE_ONLINE informs the hypervisor that the memory should
be made physically accessible. This allows the "memmap on memory"
initialization to take place during the memory hotplug onlining phase,
before the MEM_GOING_ONLINE notifier is called.
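
The ordering described above can be illustrated with a small userspace
model. This is a sketch under stated assumptions, not the code from this
series: all toy_* names are hypothetical, and the real notifiers are
delivered through the kernel's memory notifier chain, which this model
does not reproduce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the onlining phases named in the cover letter. */
enum toy_phase {
	TOY_PREPARE_ONLINE,	/* block is made physically accessible */
	TOY_INIT_MEMMAP,	/* struct pages written into the block itself */
	TOY_GOING_ONLINE,
	TOY_ONLINE,
};

#define TOY_MAX_PHASES 8

struct toy_block {
	bool accessible;
	enum toy_phase log[TOY_MAX_PHASES];
	size_t nr_logged;
};

void toy_log(struct toy_block *blk, enum toy_phase phase)
{
	assert(blk->nr_logged < TOY_MAX_PHASES);
	blk->log[blk->nr_logged++] = phase;
}

/* Writing the self-contained memmap only works on accessible memory. */
int toy_init_memmap(struct toy_block *blk)
{
	if (!blk->accessible)
		return -1;
	toy_log(blk, TOY_INIT_MEMMAP);
	return 0;
}

/* Model of onlining; with_prepare selects whether the new phase runs. */
int toy_online(struct toy_block *blk, bool with_prepare)
{
	if (with_prepare) {
		toy_log(blk, TOY_PREPARE_ONLINE);
		blk->accessible = true;
	}
	if (toy_init_memmap(blk))
		return -1;	/* memmap lives in still-inaccessible memory */
	toy_log(blk, TOY_GOING_ONLINE);
	toy_log(blk, TOY_ONLINE);
	return 0;
}
```

In this model, toy_online(&blk, true) succeeds and logs the prepare step
first, while toy_online(&blk, false) fails before any phase is logged,
which is the situation the new notifier is meant to avoid.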

Patch 1 introduces the MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE memory
notifiers to prepare the transition of memory to and from a physically
accessible state. A new mhp_flag, MHP_OFFLINE_INACCESSIBLE, is
introduced to ensure that the altmap is not written to when adding
memory, before it is set online. This enhancement is crucial for
implementing the "memmap on memory" feature for s390 in a subsequent
patch.
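
The effect of the new flag can be sketched in userspace as follows. This
is a simplified model, not the kernel implementation: the toy_* names,
the buffer size and the poison value are hypothetical, and the only
behaviour modelled is that memmap poisoning is skipped while adding
memory and performed once the range has been made accessible.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_OFFLINE_INACCESSIBLE 0x1u	/* stand-in for MHP_OFFLINE_INACCESSIBLE */
#define TOY_MEMMAP_BYTES 64		/* arbitrary size for the model */
#define TOY_POISON 0xff			/* arbitrary poison byte for the model */

struct toy_range {
	bool accessible;
	bool inaccessible_until_online;
	unsigned char memmap[TOY_MEMMAP_BYTES];
	unsigned int writes_while_inaccessible;
};

void toy_poison_memmap(struct toy_range *r)
{
	if (!r->accessible) {
		/* A real write here would touch inaccessible memory. */
		r->writes_while_inaccessible++;
		return;
	}
	memset(r->memmap, TOY_POISON, sizeof(r->memmap));
}

/* Add phase: leave the memmap untouched if the flag is passed. */
void toy_add_memory(struct toy_range *r, unsigned int flags)
{
	r->inaccessible_until_online = (flags & TOY_OFFLINE_INACCESSIBLE) != 0;
	if (!r->inaccessible_until_online)
		toy_poison_memmap(r);
}

/* Prepare-online phase: make accessible, then do the deferred poisoning. */
void toy_prepare_online(struct toy_range *r)
{
	r->accessible = true;
	if (r->inaccessible_until_online)
		toy_poison_memmap(r);
}
```

Calling toy_add_memory() with TOY_OFFLINE_INACCESSIBLE leaves the memmap
untouched until toy_prepare_online() runs; calling it without the flag
on an inaccessible range records an invalid write instead.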

Patch 2 allocates vmemmap pages from the self-contained memory range for
s390. It allocates the memory map (struct page array) from the
hotplugged memory range, rather than from system memory, by passing the
altmap to the vmemmap functions.

Patch 3 removes unhandled memory notifier types on s390.

Patch 4 implements the MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE memory
notifiers on s390. The MEM_PREPARE_ONLINE notifier makes the memory
block physically accessible via the sclp assign command. This ensures
that self-contained memory maps are accessible and hence enables
"memmap on memory" on s390. The MEM_FINISH_OFFLINE notifier shifts the
memory block to an inaccessible state via the sclp unassign command.

Patch 5 finally enables MHP_MEMMAP_ON_MEMORY on s390.

v6:
* Added use case description to the cover letter.
* Rebased against the mm branch. Added mhp_flag parameter to
create_altmaps_and_memory_blocks() in order to rebase patch 1.

v5:
* Added reviewed-by
* Removed variables altmap_start, altmap_size in sclp_cmd.c
* Used PFN_PHYS macro.

Thanks for the valuable feedback.

v4:
* Introduced two new fields, altmap_start_pfn and altmap_nr_pages, in
the memory_notify structure and documented that they are used only in
MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE callbacks.
* Incorporated the newly added fields into s390's
MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifier callbacks.
* Prevent access to memblock->altmap->free in the s390 notifier callback.
* page_init_poison() could be performed similar to when adding new
memory in sparse_add_section(). Perform it without cond_resched().

v3:
* Added comments to MHP_OFFLINE_INACCESSIBLE as suggested by David.
* Squashed three commits related to new memory notifier.

v2:
* Fixes are integrated and hence removed from this patch series
Suggestions from David:
* Add new flag MHP_OFFLINE_INACCESSIBLE to avoid accessing memory
during memory hotplug addition phase.
* Avoid page_init_poison() on memmap during mhp addition phase, when
MHP_OFFLINE_INACCESSIBLE mhp_flag is passed in add_memory().
* Do not skip add_pages() in arch_add_memory(). Similarly, remove
similar hacks in arch_remove_memory().
* Use MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE naming convention for
new memory notifiers.
* Rearrange removal of unused s390 memory notifier.
* Necessary commit message changes.

Thank you

Sumanth Korikkar (5):
mm/memory_hotplug: introduce MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE
notifiers
s390/mm: allocate vmemmap pages from self-contained memory range
s390/sclp: remove unhandled memory notifier type
s390/mm: implement MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifiers
s390: enable MHP_MEMMAP_ON_MEMORY

arch/s390/Kconfig | 1 +
arch/s390/mm/init.c | 3 --
arch/s390/mm/vmem.c | 62 +++++++++++++++++++---------------
drivers/base/memory.c | 23 ++++++++++++-
drivers/s390/char/sclp_cmd.c | 44 +++++++++++++++++++-----
include/linux/memory.h | 9 +++++
include/linux/memory_hotplug.h | 18 +++++++++-
include/linux/memremap.h | 1 +
mm/memory_hotplug.c | 17 ++++++++--
mm/sparse.c | 3 +-
10 files changed, 136 insertions(+), 45 deletions(-)

--
2.40.1