Re: [PATCH v12 3/7] crash: add generic infrastructure for crash hotplug support

From: Eric DeVolder
Date: Fri Oct 07 2022 - 15:21:58 EST




On 10/4/22 01:38, Sourabh Jain wrote:

On 10/09/22 02:35, Eric DeVolder wrote:
CPU and memory change notifications are received in order to
regenerate the elfcorehdr.

To support cpu hotplug, a callback is registered to capture the
CPUHP_AP_ONLINE_DYN online and offline events via
cpuhp_setup_state_nocalls().

To support memory hotplug, a notifier is registered to capture the
MEM_ONLINE and MEM_OFFLINE events via register_memory_notifier().

The cpu callback and memory notifiers call handle_hotplug_event()
which performs needed tasks and then dispatches the event to the
architecture specific arch_crash_handle_hotplug_event(). During the
process, the kexec_mutex is held.

Signed-off-by: Eric DeVolder <eric.devolder@xxxxxxxxxx>
Acked-by: Baoquan He <bhe@xxxxxxxxxx>
---
  include/linux/crash_core.h |   8 +++
  include/linux/kexec.h      |  26 +++++++
  kernel/crash_core.c        | 134 +++++++++++++++++++++++++++++++++++++
  3 files changed, 168 insertions(+)

diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index de62a722431e..a270f8660538 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -84,4 +84,12 @@ int parse_crashkernel_high(char *cmdline, unsigned long long system_ram,
  int parse_crashkernel_low(char *cmdline, unsigned long long system_ram,
          unsigned long long *crash_size, unsigned long long *crash_base);
+#define KEXEC_CRASH_HP_REMOVE_CPU        0
+#define KEXEC_CRASH_HP_ADD_CPU            1
+#define KEXEC_CRASH_HP_REMOVE_MEMORY        2
+#define KEXEC_CRASH_HP_ADD_MEMORY        3
+#define KEXEC_CRASH_HP_INVALID_CPU        -1U
+
+struct kimage;
+
  #endif /* LINUX_CRASH_CORE_H */
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 4eefa631e0ae..9597b41136ec 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -374,6 +374,13 @@ struct kimage {
      struct purgatory_info purgatory_info;
  #endif
+#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
+    bool hotplug_event;
+    unsigned int offlinecpu;
+    bool elfcorehdr_index_valid;
+    int elfcorehdr_index;

Do we really need elfcorehdr_index_valid to decide elfcorehdr_index holds a valid index?
No, as you point out you can overload the index value itself.
(In fact I originally went this route but encountered trouble
with locating the proper locations to place the initialization code).

However, the current approach has the advantage that it is
automatically zero'd and thus set to its correct (false) setting
immediatley upon kexec load without any additional code. As the
diff you have below indicates, there are several sites that need
to set the index to its false (-1) value to accomplish the same.

I prefer the index_valid approach, but if there is strong support
for overloading the index, then it can be changed.

eric



How about initializing elfcorehdr_index to a negative number while loading kdump kernel (or kexec kernel if needed)
for both kexec_load and kexec_file_load case and consider that as invalid index to find the correct one.

Some thing like this:

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 5bc5159d9cb1..0cccdb2f7f26 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -656,7 +656,7 @@ static void handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
                 * segment containing the elfcorehdr, if not already found.
                 * This works for both the kexec_load and kexec_file_load paths.
                 */
-               if (!image->elfcorehdr_index_valid) {
+               if (image->elfcorehdr_index < 0) {
                        unsigned char *ptr;
                        unsigned long mem, memsz;
                        unsigned int n;
diff --git a/kernel/kexec.c b/kernel/kexec.c
index b5e40f069768..ed1c6a88879b 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -156,6 +156,10 @@ static int do_kexec_load(unsigned long entry, unsigned long nr_segments,
        if (ret)
                goto out;

+       /* Below check is not necessary */
+       if (flags & KEXEC_FILE_ON_CRASH)
+               image->elfcorehdr_index = -1;
+
        /* Install the new kernel and uninstall the old */
        image = xchg(dest_image, image);

diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index d0c2661b3509..535dbc26930a 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -400,6 +400,10 @@ SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
        if (ret)
                goto out;

+       /* Below check is not necessary */
+       if (flags & KEXEC_FILE_ON_CRASH)
+               image->elfcorehdr_index = -1;
+
        /*
         * Free up any temporary buffers allocated which are not needed
         * after image has been loaded

Thanks,
Sourabh Jain