[BISECTED] Boot hangs when SLUB_DEBUG_ON=y

From: Hyeonggon Yoo
Date: Tue Nov 21 2023 - 22:24:03 EST


On Tue, Nov 21, 2023 at 1:08 PM <andrey.konovalov@xxxxxxxxx> wrote:
>
> From: Andrey Konovalov <andreyknvl@xxxxxxxxxx>
>
> Evict alloc/free stack traces from the stack depot for Generic KASAN
> once they are evicted from the quarantine.
>
> For auxiliary stack traces, evict the oldest stack trace once a new one
> is saved (KASAN only keeps references to the last two).
>
> Also evict all saved stack traces on krealloc.
>
> To avoid double-evicting and mis-evicting stack traces (in case KASAN's
> metadata was corrupted), reset KASAN's per-object metadata that stores
> stack depot handles when the object is initialized and when it's evicted
> from the quarantine.
>
> Note that stack_depot_put is a no-op if the handle is 0.
>
> Reviewed-by: Marco Elver <elver@xxxxxxxxxx>
> Signed-off-by: Andrey Konovalov <andreyknvl@xxxxxxxxxx>
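
For reference, the handle-reset scheme described in the quoted patch can be
sketched in userspace C roughly as below. This is only a toy model under
stated assumptions, not the kernel implementation: toy_depot_save(),
toy_depot_put(), toy_evict(), struct toy_object_meta and the fixed-size
refcount table are all hypothetical names made up for illustration. The only
detail taken from the patch description is that a put on handle 0 does
nothing, and that the per-object handles are reset on eviction.

```c
#include <assert.h>

/* Toy model only: hypothetical names, not the lib/stackdepot.c API. */
typedef unsigned int toy_handle_t;

#define TOY_SLOTS 8
/* Slot 0 is reserved so that handle 0 can mean "no stack trace saved". */
static int toy_refcount[TOY_SLOTS];

/* Claim a free slot with one reference; returns 0 if the table is full. */
static toy_handle_t toy_depot_save(void)
{
	for (toy_handle_t h = 1; h < TOY_SLOTS; h++) {
		if (toy_refcount[h] == 0) {
			toy_refcount[h] = 1;
			return h;
		}
	}
	return 0;
}

/* Drop one reference; a put on handle 0 is a no-op. */
static void toy_depot_put(toy_handle_t h)
{
	if (h == 0)
		return;
	assert(toy_refcount[h] > 0);
	toy_refcount[h]--;
}

/* Per-object metadata holding the alloc and free trace handles. */
struct toy_object_meta {
	toy_handle_t alloc_track;
	toy_handle_t free_track;
};

/*
 * Evict both traces, then reset the handles so that a second eviction
 * of the same object cannot drop the references twice.
 */
static void toy_evict(struct toy_object_meta *m)
{
	toy_depot_put(m->alloc_track);
	toy_depot_put(m->free_track);
	m->alloc_track = 0;
	m->free_track = 0;
}

/* Count slots that still hold at least one reference. */
static int toy_live_records(void)
{
	int n = 0;

	for (int h = 1; h < TOY_SLOTS; h++)
		n += toy_refcount[h] > 0;
	return n;
}
```

In this model, calling toy_evict() twice on the same metadata is harmless,
because the second call only sees handle 0. It does not model other stack
depot users holding the same handle.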

I observed boot hangs on a few SLUB configurations.

Other users of stackdepot might be the cause: after passing
'slub_debug=-', which disables SLUB debugging, it boots fine.

compiler version: gcc-11
config: https://download.kerneltesting.org/builds/2023-11-21-f121f2/.config
bisect log: https://download.kerneltesting.org/builds/2023-11-21-f121f2/bisect.log.txt

[dmesg]
(gdb) lx-dmesg
[ 0.000000] Linux version 6.7.0-rc1-00136-g0e8b630f3053 (hyeyoo@localhost.localdomain) (gcc (GCC) 11.3.1 20221121 (R3
[ 0.000000] Command line: console=ttyS0 root=/dev/sda1 nokaslr
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc1-00136-g0e8b630f3053 #22
[ 0.000000] RIP: 0010:setup_arch+0x500/0x2250
[ 0.000000] Code: c6 09 08 00 48 89 c5 48 85 c0 0f 84 58 13 00 00 48 c1 e8 03 48 83 05 be 97 66 00 01 80 3c 18 00 0f3
[ 0.000000] RSP: 0000:ffffffff86007e00 EFLAGS: 00010046 ORIG_RAX: 0000000000000009
[ 0.000000] RAX: 1fffffffffe40088 RBX: dffffc0000000000 RCX: 1ffffffff11ed630
[ 0.000000] RDX: 0000000000000000 RSI: feec4698e8103000 RDI: ffffffff88f6b180
[ 0.000000] RBP: ffffffffff200444 R08: 8000000000000163 R09: 1ffffffff11ed628
[ 0.000000] R10: ffffffff88f7a150 R11: 0000000000000000 R12: 0000000000000010
[ 0.000000] R13: ffffffffff200450 R14: feec4698e8102444 R15: feec4698e8102444
[ 0.000000] FS: 0000000000000000(0000) GS:ffffffff88d5b000(0000) knlGS:0000000000000000
[ 0.000000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.000000] CR2: ffffffffff200444 CR3: 0000000008f0e000 CR4: 00000000000000b0
[ 0.000000] Call Trace:
[ 0.000000] <TASK>
[ 0.000000] ? show_regs+0x87/0xa0
[ 0.000000] ? early_fixup_exception+0x130/0x310
[ 0.000000] ? do_early_exception+0x23/0x90
[ 0.000000] ? early_idt_handler_common+0x2f/0x40
[ 0.000000] ? setup_arch+0x500/0x2250
[ 0.000000] ? __pfx_setup_arch+0x10/0x10
[ 0.000000] ? vprintk_default+0x20/0x30
[ 0.000000] ? vprintk+0x4c/0x80
[ 0.000000] ? _printk+0xba/0xf0
[ 0.000000] ? __pfx__printk+0x10/0x10
[ 0.000000] ? init_cgroup_root+0x10f/0x2f0
[ 0.000000] ? cgroup_init_early+0x1e4/0x440
[ 0.000000] ? start_kernel+0xae/0x790
[ 0.000000] ? x86_64_start_reservations+0x28/0x50
[ 0.000000] ? x86_64_start_kernel+0x10e/0x130
[ 0.000000] ? secondary_startup_64_no_verify+0x178/0x17b
[ 0.000000] </TASK>

--
Hyeonggon