PROBLEM: consolidated IDT invalidation causes kexec to reboot

From: Alexandru Chirvasitu
Date: Sat Dec 23 2017 - 20:43:06 EST


Short description: after loading a crash kernel with (a) kexec -l [..] or
(b) kexec -p [..], testing it with (a) kexec -e or (b) echo c
> /proc/sysrq-trigger results in a regular reboot (going through BIOS,
etc.) instead of booting the loaded kernel.

The first commit that exhibits this behaviour for me is

e802a51: x86/idt: Consolidate IDT invalidation

with its parent 8f55868 behaving normally (in scenarios (a) and (b)
alike; (b) produces /proc/vmcore, etc.).

For testing purposes, I altered machine_kexec_32.c in the following
toy commit. It naively undoes part of e802a51, solely to confirm that
this is where things go awry in my setup.


----------------------------------------------

machine_kexec calls set_idt instead of idt_invalidate for testing purposes

diff --git a/arch/x86/kernel/machine_kexec_32.c b/arch/x86/kernel/machine_kexec_32.c
index 00bc751..70f7d05 100644
--- a/arch/x86/kernel/machine_kexec_32.c
+++ b/arch/x86/kernel/machine_kexec_32.c
@@ -26,6 +26,19 @@
 #include <asm/set_memory.h>
 #include <asm/debugreg.h>

+
+
+static void set_idt(void *newidt, __u16 limit)
+{
+	struct desc_ptr curidt;
+
+	/* ia32 supports unaligned loads & stores */
+	curidt.size = limit;
+	curidt.address = (unsigned long)newidt;
+
+	load_idt(&curidt);
+}
+
 static void set_gdt(void *newgdt, __u16 limit)
 {
 	struct desc_ptr curgdt;
@@ -233,7 +246,7 @@ void machine_kexec(struct kimage *image)
 	 * If you want to load them you must set up your own idt & gdt.
 	 */
 	set_gdt(phys_to_virt(0), 0);
-	idt_invalidate(phys_to_virt(0));
+	set_idt(phys_to_virt(0), 0);

 	/* now call it */
 	image->start = relocate_kernel_ptr((unsigned long)image->head

----------------------------------------------

The kernel compiled with these changes restores kexec functionality on
the machine I'm trying it on:

ASUS F5RL Core(TM)2 Duo CPU T5450 @ 1.66GHz

running Debian stable 9.3, 32-bit. The loading command I use:

kexec [-l|-p] /boot/dump/vmlinuz-4.14.8-dump --initrd=/boot/dump/initrd.img-4.14.8-dump --append="root=/dev/sda1 1 irqpoll nr_cpus=1 reset_devices"

The nr_cpus=1 is a leftover; the dump kernel is an SMP-disabled build
of the latest stable release (4.14.8).

Is this expected behaviour?
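
For comparing the two calls in the diff above, here is a minimal
user-space sketch of the descriptor each one would hand to load_idt().
It rests on an assumption not shown in this message: that
idt_invalidate(addr), as introduced by e802a51, fills a struct desc_ptr
with .address = addr and .size = 0. The struct layout and the model_*
and same_descriptor names are stand-ins for illustration, not kernel
code.

```c
#include <stdint.h>
#include <string.h>

/* User-space stand-in for the kernel's struct desc_ptr:
 * a 16-bit limit followed by the base address, packed. */
struct desc_ptr {
	uint16_t size;
	uintptr_t address;
} __attribute__((packed));

/* Models the set_idt() helper restored by the toy commit:
 * the caller passes the limit explicitly. */
static inline struct desc_ptr model_set_idt(void *newidt, uint16_t limit)
{
	struct desc_ptr d;

	memset(&d, 0, sizeof(d));
	d.size = limit;
	d.address = (uintptr_t)newidt;
	return d;
}

/* ASSUMPTION: models idt_invalidate() as building a descriptor
 * with the given address and a limit of 0. */
static inline struct desc_ptr model_idt_invalidate(void *addr)
{
	struct desc_ptr d;

	memset(&d, 0, sizeof(d));
	d.size = 0;
	d.address = (uintptr_t)addr;
	return d;
}

/* Returns 1 if both models produce byte-identical descriptors. */
static inline int same_descriptor(void *addr)
{
	struct desc_ptr a = model_set_idt(addr, 0);
	struct desc_ptr b = model_idt_invalidate(addr);

	return memcmp(&a, &b, sizeof(a)) == 0;
}
```

Under that assumption, same_descriptor() returns 1 for any address, so
set_idt(phys_to_virt(0), 0) and idt_invalidate(phys_to_virt(0)) would
load the same value into IDTR. If so, the difference in behaviour would
have to come from something other than the descriptor contents.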

The issue emerged while reporting a CPU lockup in another email
thread; as this seems different, I figured it wouldn't hurt to send
out a separate message.


Thank you.