Re: [PATCH v3 4/5] s390/mm: implement MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifiers

From: Sumanth Korikkar
Date: Mon Nov 27 2023 - 11:12:55 EST


On Mon, Nov 27, 2023 at 04:11:05PM +0100, David Hildenbrand wrote:
> > diff --git a/drivers/s390/char/sclp_cmd.c b/drivers/s390/char/sclp_cmd.c
> > index 355e63e44e95..30b829e4c052 100644
> > --- a/drivers/s390/char/sclp_cmd.c
> > +++ b/drivers/s390/char/sclp_cmd.c
> > @@ -18,6 +18,7 @@
> > #include <linux/mm.h>
> > #include <linux/mmzone.h>
> > #include <linux/memory.h>
> > +#include <linux/memory_hotplug.h>
> > #include <linux/module.h>
> > #include <asm/ctlreg.h>
> > #include <asm/chpid.h>
> > @@ -26,6 +27,7 @@
> > #include <asm/sclp.h>
> > #include <asm/numa.h>
> > #include <asm/facility.h>
> > +#include <asm/page-states.h>
> > #include "sclp.h"
> > @@ -319,6 +321,7 @@ static bool contains_standby_increment(unsigned long start, unsigned long end)
> > static int sclp_mem_notifier(struct notifier_block *nb,
> > unsigned long action, void *data)
> > {
> > + struct memory_block *memory_block;
> > unsigned long start, size;
> > struct memory_notify *arg;
> > unsigned char id;
> > @@ -340,18 +343,29 @@ static int sclp_mem_notifier(struct notifier_block *nb,
> > if (contains_standby_increment(start, start + size))
> > rc = -EPERM;
> > break;
> > - case MEM_GOING_ONLINE:
> > + case MEM_PREPARE_ONLINE:
> > + memory_block = find_memory_block(pfn_to_section_nr(arg->start_pfn));
> > + if (!memory_block) {
> > + rc = -EINVAL;
> > + goto out;
> > + }
> > rc = sclp_mem_change_state(start, size, 1);
> > + if (rc || !memory_block->altmap)
> > + goto out;
> > + /*
> > + * Set CMMA state to nodat here, since the struct page memory
> > + * at the beginning of the memory block will not go through the
> > + * buddy allocator later.
> > + */
> > + __arch_set_page_nodat((void *)__va(start), memory_block->altmap->free);
>
> Looking up the memory block and grabbing the altmap from there is a bit
> unfortunate.
>
> Why can't we do that when adding the altmap? Will the hypervisor scream at
> us?
>
Calling __arch_set_page_nodat() before the memory block is made
accessible will lead to a crash. Hence, we think this is the only safe
location to place it.

> ... would we want to communicate any altmap start+size via the memory
> notifier instead?

Passing the start and size of the memory range via the memory notifier
looks like the correct approach to me, since we are making the specified
range accessible.

If we want to pass the altmap size (nr_vmemmap_pages), then we might need
a new field in struct memory_notify, which would avoid accessing
memory_block->altmap->free in the notifier.

Do you want to take this approach instead?

If yes, then I could add a new field nr_vmemmap_pages to struct
memory_notify and place it in the patch "introduce
MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifiers".
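
To make the nr_vmemmap_pages idea concrete, here is a rough userspace
mock-up of how the notifier could look with such a field. This is only a
sketch and not the actual patch: every mock_* name, MOCK_PAGE_SHIFT, and
the layout of the struct are placeholders I made up for illustration, and
the stubs merely stand in for sclp_mem_change_state() and
__arch_set_page_nodat().

```c
#include <stddef.h>

/* Userspace mock-up only; nothing here is real kernel code. */
#define MOCK_PAGE_SHIFT 12 /* assumed 4K pages */

/* Hypothetical stand-in for struct memory_notify with the proposed field. */
struct mock_memory_notify {
	unsigned long start_pfn;
	unsigned long nr_pages;
	unsigned long nr_vmemmap_pages; /* proposed field; 0 means no altmap */
};

/* Record what the stubbed "set nodat" helper was asked to do. */
static unsigned long mock_nodat_start;
static unsigned long mock_nodat_pages;

/* Stub standing in for __arch_set_page_nodat(). */
static void mock_set_page_nodat(unsigned long start, unsigned long nr_pages)
{
	mock_nodat_start = start;
	mock_nodat_pages = nr_pages;
}

/* Stub standing in for sclp_mem_change_state(); always succeeds here. */
static int mock_mem_change_state(unsigned long start, unsigned long size,
				 int online)
{
	(void)start;
	(void)size;
	(void)online;
	return 0;
}

/* Sketch of the MEM_PREPARE_ONLINE case without a memory block lookup. */
static int mock_prepare_online(const struct mock_memory_notify *arg)
{
	unsigned long start = arg->start_pfn << MOCK_PAGE_SHIFT;
	unsigned long size = arg->nr_pages << MOCK_PAGE_SHIFT;
	int rc;

	/* Make the range accessible first; touching it earlier would crash. */
	rc = mock_mem_change_state(start, size, 1);
	if (rc || !arg->nr_vmemmap_pages)
		return rc;

	/* Only the vmemmap pages at the start of the block are set to nodat. */
	mock_set_page_nodat(start, arg->nr_vmemmap_pages);
	return 0;
}
```

With this shape the notifier would no longer need find_memory_block() or
the altmap pointer; whether a plain page count is sufficient, or the
altmap start is needed as well, is still open.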


Thanks