Re: [RFC PATCH 0/2] x86: kprobes: Fix CFI_CLANG related issues

From: Nathan Chancellor
Date: Tue Jul 11 2023 - 14:37:13 EST


Masami, thanks for verifying!

Hi Greg and Sasha,

On Tue, Jul 11, 2023 at 10:33:03AM +0900, Masami Hiramatsu wrote:
> On Mon, 10 Jul 2023 08:57:03 -0700
> Nathan Chancellor <nathan@xxxxxxxxxx> wrote:
>
> > On Mon, Jul 10, 2023 at 09:14:13PM +0900, Masami Hiramatsu (Google) wrote:
> > > I just build tested, since I could not boot the kernel with CFI_CLANG=y.
> > > Would anyone know something about this error?
> > >
> > > [ 0.141030] MMIO Stale Data: Unknown: No mitigations
> > > [ 0.153511] SMP alternatives: Using kCFI
> > > [ 0.164593] Freeing SMP alternatives memory: 36K
> > > [ 0.165053] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x472/0x48b
> > > [ 0.166028] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.2-00002-g12b1b2fca8ef #126
> > > [ 0.166028] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > > [ 0.166028] Call Trace:
> > > [ 0.166028] <TASK>
> > > [ 0.166028] dump_stack_lvl+0x6e/0xb0
> > > [ 0.166028] panic+0x146/0x2f0
> > > [ 0.166028] ? start_kernel+0x472/0x48b
> > > [ 0.166028] __stack_chk_fail+0x14/0x20
> > > [ 0.166028] start_kernel+0x472/0x48b
> > > [ 0.166028] x86_64_start_reservations+0x24/0x30
> > > [ 0.166028] x86_64_start_kernel+0xa6/0xbb
> > > [ 0.166028] secondary_startup_64_no_verify+0x106/0x11b
> > > [ 0.166028] </TASK>
> > > [ 0.166028] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x472/0x48b ]---
> >
> > This looks like https://github.com/ClangBuiltLinux/linux/issues/1815 to
> > me. What version of LLVM are you using? This was fixed in 16.0.4. Commit
> > 514ca14ed544 ("start_kernel: Add __no_stack_protector function
> > attribute") should resolve it on the Linux side, it looks like that is
> > in 6.5-rc1. Not sure if we should backport it or just let people upgrade
> > their toolchains on older releases.
>
> Thanks for the info. I confirmed that the commit fixed the boot issue.
> So I think it should be backported to the stable tree.

Would you please apply commit 514ca14ed544 ("start_kernel: Add
__no_stack_protector function attribute") to linux-6.4.y? The series
ending with commit 611d4c716db0 ("x86/hyperv: Mark hv_ghcb_terminate()
as noreturn") that shipped in 6.4 exposes an LLVM issue that affected
16.0.0 and 16.0.1, which was resolved in 16.0.2. When using those
affected LLVM releases, the following crash at boot occurs:

[ 0.181667] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x3cf/0x3d0
[ 0.182621] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.3 #1
[ 0.182621] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
[ 0.182621] Call Trace:
[ 0.182621] <TASK>
[ 0.182621] dump_stack_lvl+0x6a/0xa0
[ 0.182621] panic+0x124/0x2f0
[ 0.182621] ? start_kernel+0x3cf/0x3d0
[ 0.182621] ? acpi_enable+0x64/0xc0
[ 0.182621] __stack_chk_fail+0x14/0x20
[ 0.182621] start_kernel+0x3cf/0x3d0
[ 0.182621] x86_64_start_reservations+0x24/0x30
[ 0.182621] x86_64_start_kernel+0xab/0xb0
[ 0.182621] secondary_startup_64_no_verify+0x107/0x10b
[ 0.182621] </TASK>
[ 0.182621] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x3cf/0x3d0 ]---

514ca14ed544 aims to avoid this on the Linux side. I have verified that
it applies to 6.4.3 cleanly and resolves the issue there, as has Masami.

If there are any issues or questions, please let me know.

Cheers,
Nathan