Re: [PATCH v12 7/7] x86/crash: Add x86 crash hotplug support

From: Eric DeVolder
Date: Fri Sep 30 2022 - 13:12:13 EST




On 9/30/22 11:50, Borislav Petkov wrote:
On Fri, Sep 30, 2022 at 10:36:49AM -0500, Eric DeVolder wrote:
Your help text talks about System RAM entries in /proc/iomem which means
that those entries are present somewhere in the kernel and you can read
them out and do the proper calculations dynamically instead of doing the
static CONFIG_NR_CPUS_DEFAULT + CONFIG_CRASH_MAX_MEMORY_RANGES thing.

The intent is to compute the max size buffer needed to contain a maximum
populated elfcorehdr, which is primarily based on the number of CPUs and
memory regions. Thus far I (and others involved) have not found a kernel
method to determine the maximum number of memory regions possible (if you
are aware of one, please let me know!). Thus CONFIG_CRASH_MAX_MEMORY_RANGES
was born (rather borrowed from kexec-tools).

Let's ask some mm folks.

mm folks, is there a way to enumerate all the memory regions a machine
has?

It looks to me like register_memory_resource() in mm/memory_hotplug.c
does register the resource so there should be a way to count that list
of resources or at least maintain a count somewhere so that kexec/crash
code can know how big its elfcodehdr buffer should be instead of doing a
clumsy Kconfig item where people would need to guess...

Hmm.


There is of course a way to enumerate the memory regions in use on the machine, that is not what this code needs. In order to compute the maximum buffer size needed (this buffer size is computed once), the count of the maximum number of memory regions possible (even if not currently in use) is what is needed.

+#ifdef CONFIG_CRASH_MAX_MEMORY_RANGES
So I think the use of CONFIG_CRASH_MAX_MEMORY_RANGES is not correct; it
still needs to be based on the cpu or memory hotplug options.

You're kidding, right?

+config CRASH_MAX_MEMORY_RANGES
+ depends on CRASH_DUMP && KEXEC_FILE && (HOTPLUG_CPU || MEMORY_HOTPLUG)
^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Oh, that would be an error of haste on my part. This should be:
depends on CRASH_DUMP && MEMORY_HOTPLUG


@@ -622,6 +622,15 @@ static int __init crash_save_vmcoreinfo_init(void)
subsys_initcall(crash_save_vmcoreinfo_init);
#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
+
+void __weak *arch_map_crash_pages(unsigned long paddr, unsigned long size)
+{
+ return NULL;
+}
+
+void __weak arch_unmap_crash_pages(void **ptr) { }
+void __weak arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action) { }
+
I was asked by Baoquan He to eliminate the use of __weak

Because?


Baoquan pointed me to:

https://lore.kernel.org/lkml/cover.1656659357.git.naveen.n.rao@xxxxxxxxxxxxxxxxxx/T/

Thanks,
eric