Re: kexec reboot failed due to commit 75d090fd167ac

From: Aaron Lu
Date: Tue Aug 29 2023 - 10:06:14 EST


On Tue, Aug 29, 2023 at 03:59:39PM +0300, Kirill A. Shutemov wrote:
> On Tue, Aug 29, 2023 at 08:51:34PM +0800, Aaron Lu wrote:
> > On Tue, Aug 29, 2023 at 07:14:59PM +0700, Bagas Sanjaya wrote:
> > > On Tue, Aug 29, 2023 at 07:48:16PM +0800, Aaron Lu wrote:
> > > > Hi Kirill,
> > > >
> > > > Ever since v6.5-rc1, I found that I cannot use kexec to reboot an Intel
> > > > SPR test machine. With git bisect, the first bad commit is 75d090fd167ac
> > > > ("x86/tdx: Add unaccepted memory support").
> > > >
> > > > I have no idea why a tdx change would affect it; I'm not doing anything
> > > > related to tdx.
> > > >
> > > > Any ideas?
>
> Are we talking about bare metal? Or is it kexec in a VM?

Bare metal.

> > > > The kernel config is attached, let me know if you need any other info.
> > >
> > > Can you provide system logs (e.g. journalctl output) when attempting to
> > > reboot?
> >
> > ... ...
> > Aug 29 19:15:59 be3af2b6059f systemd-shutdown[1]: Syncing filesystems and block devices.
> > Aug 29 19:15:59 be3af2b6059f systemd-shutdown[1]: Sending SIGTERM to remaining processes...
> > Aug 29 19:16:00 be3af2b6059f systemd-journald[2629]: Journal stopped
> > -- Boot 7e5173842b8b4be581886ff25ad0c02f --
> > Aug 29 19:24:27 be3af2b6059f kernel: microcode: updated early: 0x2b000161 -> 0x2b000461, date = 2023-03-13
> > Aug 29 19:24:27 be3af2b6059f kernel: Linux version 6.3.8-100.fc37.x86_64 (mockbuild@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx)
> > Aug 29 19:24:27 be3af2b6059f kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.3.8-100.fc37.x86_64 root=UUID=4381321e-e0>
> >
> > The first 3 lines are from the first kernel. I then attempted to kexec
> > reboot into 6.4.0-rc5-00009-g75d090fd167a, and the remote console hung
> > showing the reboot message of the first kernel. After a while, I
> > concluded that kexec had failed, so I power cycled the machine to boot
> > into a distro kernel; that is the last 3 lines. There is no trace of the
> > failed boot.
> >
> > I guess the kexeced kernel failed early in the boot process, so the log,
> > if there is any, is probably only available over serial. Unfortunately,
> > there is no serial support for this machine.
>
> Could you show dmesg of the first kernel before kexec?

Attached.

BTW, kexec is invoked like this:
kver=6.4.0-rc5-00009-g75d090fd167a
kdir=$HOME/kernels/$kver
sudo kexec -l $kdir/vmlinuz-$kver --initrd=$kdir/initramfs-$kver.img --append="root=UUID=4381321e-e01e-455a-9d46-5e8c4c5b2d02 ro net.ifnames=0 acpi_rsdp=0x728e8014 no_hash_pointers sched_verbose selinux=0"

Thanks,
Aaron

Attachment: dmesg_spr.gz
Description: application/gzip