Kernel null pointer dereference on stopping raid device

From: Jain, Ayush
Date: Tue Jun 13 2023 - 16:13:22 EST


Hello All,

On the next-20230613 release, stopping a RAID device after creating it
triggers a kernel NULL pointer dereference on AMD x86 systems.

Kernel: 6.4.0-rc6-next-20230613
Commit: 1f6ce8392d6ff48

$ mdadm --create --assume-clean /dev/md/mdsraid --level=0 --raid-devices=1 /dev/loop0 --metadata=1.2 --verbose --force
$ mdadm --stop /dev/md/mdsraid
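
For a clean machine, the two commands above can be wrapped in a small script that also sets up the loop device. This is only a sketch: the backing file path, its size, and the RUN_REPRO guard are assumptions for illustration and were not part of the original run. It needs root, and on an affected kernel it is expected to oops, so the loop device cleanup at the end may never run.

```shell
#!/bin/sh
# Reproducer sketch. IMG and SIZE are assumed values, not taken from the
# original setup; only the two mdadm invocations come from the report.
IMG=${IMG:-/tmp/md-repro.img}
SIZE=${SIZE:-1G}

# Runs in a subshell so that "set -eu" does not leak into the caller.
repro() (
    set -eu
    # Create a sparse backing file and attach it to a free loop device.
    truncate -s "$SIZE" "$IMG"
    LOOP=$(losetup --find --show "$IMG")

    # Same create/stop sequence as in the report, on the new loop device.
    mdadm --create --assume-clean /dev/md/mdsraid --level=0 \
          --raid-devices=1 "$LOOP" --metadata=1.2 --verbose --force
    mdadm --stop /dev/md/mdsraid

    # Only reached if the stop did not crash the kernel.
    losetup -d "$LOOP"
)

# Guard (hypothetical variable): nothing runs unless explicitly requested.
if [ "${RUN_REPRO:-0}" = 1 ]; then
    repro
fi
```

Invoke it as root with RUN_REPRO=1 set in the environment; without that variable the script only defines the function.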


The kernel trace is attached below:
[ 32.260763] PEFILE: Unsigned PE binary
[ 117.236671] block device autoloading is deprecated and will be removed.
[ 117.262329] md127: detected capacity change from 0 to 25581568
[ 180.249007] md127: detected capacity change from 25581568 to 0
[ 180.255540] md: md127 stopped.
[ 180.268433] BUG: kernel NULL pointer dereference, address: 00000000000000a4
[ 180.276210] #PF: supervisor read access in kernel mode
[ 180.281947] #PF: error_code(0x0000) - not-present page
[ 180.287676] PGD 0 P4D 0
[ 180.290508] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 180.295374] CPU: 5 PID: 7674 Comm: mdadm Kdump: loaded Not tainted 6.4.0-rc6-next-20230613 #1
[ 180.315092] RIP: 0010:export_rdev+0xb2/0x1f0
[ 180.319869] Code: c7 43 40 00 00 00 00 48 8d bb 48 01 00 00 e8 c5 c0 c5 ff 48 8b 83 b8 00 00 00 a8 10 74 0c 48 8b 43 30 8b 78 34 e8 ae fe ff ff <83> bd a4 00 00 00 fe 48 c7 c6 c0 f9 aa 9d 48 8b 7b 30 48 0f 45 f3
[ 180.340820] RSP: 0018:ffffb1dadc677da0 EFLAGS: 00010246
[ 180.346655] RAX: 0000000000000002 RBX: ffff9ca944130e00 RCX: 0000000080080007
[ 180.354622] RDX: 0000000080080008 RSI: fffffc7fc20f2c00 RDI: 0000000000000000
[ 180.362588] RBP: 0000000000000000 R08: ffff9d0943cb0000 R09: 0000000080080007
[ 180.370553] R10: 0000000040000000 R11: 0000000000000001 R12: 0000000000000000
[ 180.378512] R13: 0000000000000000 R14: ffff9d0943cb21d8 R15: ffff9ca94307c400
[ 180.386470] FS: 00007f2a63448740(0000) GS:ffff9ca8fef40000(0000) knlGS:0000000000000000
[ 180.395502] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 180.401917] CR2: 00000000000000a4 CR3: 0000000102fcc000 CR4: 00000000003506e0
[ 180.409875] Call Trace:
[ 180.412608] <TASK>
[ 180.414957] ? __die+0x24/0x70
[ 180.418372] ? page_fault_oops+0x82/0x150
[ 180.422852] ? exc_page_fault+0x69/0x150
[ 180.427237] ? asm_exc_page_fault+0x26/0x30
[ 180.431916] ? export_rdev+0xb2/0x1f0
[ 180.436005] ? md_kick_rdev_from_array+0x118/0x150
[ 180.441354] do_md_stop+0x28e/0x580
[ 180.445241] ? security_capable+0x3a/0x60
[ 180.449721] md_ioctl+0x540/0x940
[ 180.453423] ? selinux_bprm_creds_for_exec+0x291/0x2a0
[ 180.459163] blkdev_ioctl+0x142/0x280
[ 180.463255] __x64_sys_ioctl+0x91/0xd0
[ 180.467447] do_syscall_64+0x3f/0x90
[ 180.471440] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 180.477081] RIP: 0033:0x7f2a6323ec6b
[ 180.481073] Code: 73 01 c3 48 8b 0d b5 b1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 b1 1b 00 f7 d8 64 89 01 48
[ 180.502032] RSP: 002b:00007ffc29d52238 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 180.510484] RAX: ffffffffffffffda RBX: 0000000000000019 RCX: 00007f2a6323ec6b
[ 180.518449] RDX: 0000000000000000 RSI: 0000000000000932 RDI: 0000000000000003
[ 180.526415] RBP: 0000000000000003 R08: 0000000000000207 R09: 00007ffc29d51eb5
[ 180.534373] R10: 000000000000007f R11: 0000000000000246 R12: 0000555c79876280
[ 180.542338] R13: 00007ffc29d55379 R14: 00007ffc29d52330 R15: 00007ffc29d523d0
[ 180.550305] </TASK>
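
A rough reading of the trace, in case it helps triage (this is an interpretation of the dumped bytes, not something checked against the source): the marked bytes in the kernel-mode Code: line, 83 bd a4 00 00 00 fe, decode as cmpl $0xfffffffe,0xa4(%rbp), and RBP is 0 in the register dump, so the read lands at 0xa4, which is the address CR2 reports. The check below just redoes that arithmetic; the variable names are illustrative. scripts/decodecode in the kernel tree can disassemble the whole Code: line.

```shell
#!/bin/sh
# Values copied from the register dump and the faulting instruction bytes.
rbp=0x0000000000000000   # RBP at the time of the fault
disp=0xa4                # disp32 "a4 00 00 00" (little-endian) from the instruction
cr2=0x00000000000000a4   # faulting address reported in CR2

# Effective address of the memory operand: base register + displacement.
fault_addr=$(( $rbp + $disp ))
printf 'computed fault address: 0x%x\n' "$fault_addr"

if [ "$fault_addr" -eq $(( $cr2 )) ]; then
    echo "matches CR2: read through a NULL base pointer at offset $disp"
else
    echo "does not match CR2"
fi
```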

Thanks & Regards,
Ayush Jain