[Syzkaller & bisect] There is WARNING in shmem_writepage in v6.7

From: Pengfei Xu
Date: Wed Jan 10 2024 - 21:21:21 EST


Hi Luis Chamberlain,

There is a WARNING in shmem_writepage on v6.7, reproduced in a QEMU guest.

All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/240110_151928_shmem_writepage
Syzkaller syscall reproducer steps: https://github.com/xupengfe/syzkaller_logs/blob/main/240110_151928_shmem_writepage/repro.prog
Syzkaller reproducer C code: https://github.com/xupengfe/syzkaller_logs/blob/main/240110_151928_shmem_writepage/repro.c
Binary: https://github.com/xupengfe/syzkaller_logs/raw/main/240110_151928_shmem_writepage/repro
Kconfig(make olddefconfig): https://github.com/xupengfe/syzkaller_logs/blob/main/240110_151928_shmem_writepage/kconfig_origin
Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/240110_151928_shmem_writepage/bisect_info.log
Issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/240110_151928_shmem_writepage/0dd3ee31125508cd67f7e7172247f05b7fd1753a_dmesg.log
v6.7 bzImage: https://github.com/xupengfe/syzkaller_logs/raw/main/240110_151928_shmem_writepage/bzImage_v6.7.tar.gz

Bisection points to the following commit as the suspected cause:
9a976f0c847b shmem: skip page split if we're not reclaiming

"
[ 31.541851] ------------[ cut here ]------------
[ 31.542523] WARNING: CPU: 0 PID: 952 at mm/shmem.c:1438 shmem_writepage+0x28d/0x10f0
[ 31.543355] Modules linked in:
[ 31.543711] CPU: 0 PID: 952 Comm: repro Not tainted 6.7.0-0dd3ee311255+ #1
[ 31.544455] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 31.546342] RIP: 0010:shmem_writepage+0x28d/0x10f0
[ 31.546892] Code: 31 ff 0f b6 80 c5 00 00 00 89 c6 88 85 28 ff ff ff e8 a7 55 bd ff 0f b6 85 28 ff ff ff 84 c0 0f 84 7a 01 00 00 e8 e3 5a bd ff <0f> 0b e8 dc 5a bd ff 4c 89 e7 e8 b4 67 fa ff 4c 89 f2 48 b8 00 00
[ 31.548893] RSP: 0018:ffff88801aa27040 EFLAGS: 00010293
[ 31.549805] RAX: 0000000000000000 RBX: ffffea00008eeac0 RCX: ffffffff81a433b6
[ 31.550583] RDX: ffff888023d84a00 RSI: ffffffff81a4342d RDI: 0000000000000007
[ 31.551352] RBP: ffff88801aa27140 R08: ffff8880118a6b28 R09: fffff9400011dd58
[ 31.552123] R10: 0000000000002000 R11: 0000000000000001 R12: ffffea00008eeac0
[ 31.552891] R13: ffff8880118a67e8 R14: ffff88801aa271bc R15: 0000000000000008
[ 31.553710] FS: 00007f47e5bb7640(0000) GS:ffff88806cc00000(0000) knlGS:0000000000000000
[ 31.554577] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 31.555205] CR2: 00007f47e584e1f0 CR3: 000000000f772003 CR4: 0000000000770ef0
[ 31.556039] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 31.556835] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 31.557656] PKRU: 55555554
[ 31.557969] Call Trace:
[ 31.558267] <TASK>
[ 31.558536] ? show_regs+0xa9/0xc0
[ 31.558946] ? __warn+0xef/0x340
[ 31.559325] ? report_bug+0x25e/0x4b0
[ 31.559766] ? shmem_writepage+0x28d/0x10f0
[ 31.560251] ? report_bug+0x2cb/0x4b0
[ 31.560674] ? shmem_writepage+0x28d/0x10f0
[ 31.561188] ? handle_bug+0xa2/0x130
[ 31.561629] ? exc_invalid_op+0x3c/0x80
[ 31.562081] ? asm_exc_invalid_op+0x1f/0x30
[ 31.562576] ? shmem_writepage+0x216/0x10f0
[ 31.563052] ? shmem_writepage+0x28d/0x10f0
[ 31.563530] ? shmem_writepage+0x28d/0x10f0
[ 31.564065] ? __pfx_shmem_writepage+0x10/0x10
[ 31.564584] ? __sanitizer_cov_trace_const_cmp8+0x1c/0x30
[ 31.565223] ? __kasan_check_write+0x18/0x20
[ 31.565727] ? folio_clear_dirty_for_io+0xc1/0x600
[ 31.566280] pageout+0x3aa/0x900
[ 31.566664] ? __pfx_pageout+0x10/0x10
[ 31.567096] ? __pfx_kvm_flush_tlb_multi+0x10/0x10
[ 31.567653] ? arch_tlbbatch_flush+0x2b9/0x430
[ 31.568175] shrink_folio_list+0x122b/0x35f0
[ 31.568680] ? __pfx_shrink_folio_list+0x10/0x10
[ 31.569236] ? __lock_acquire+0x1a03/0x5cc0
[ 31.569747] ? __lock_acquire+0x1a03/0x5cc0
[ 31.570244] reclaim_folio_list+0xd9/0x2f0
[ 31.570710] ? __pfx___lock_acquire+0x10/0x10
[ 31.571213] ? __pfx_reclaim_folio_list+0x10/0x10
[ 31.571787] reclaim_pages+0x39c/0x5b0
[ 31.572271] ? __pfx_reclaim_pages+0x10/0x10
[ 31.572780] madvise_cold_or_pageout_pte_range+0x1297/0x2450
[ 31.573478] ? __pfx_madvise_cold_or_pageout_pte_range+0x10/0x10
[ 31.574172] ? __pfx_madvise_cold_or_pageout_pte_range+0x10/0x10
[ 31.574841] walk_pgd_range+0x11a8/0x21e0
[ 31.575333] ? __pfx_walk_pgd_range+0x10/0x10
[ 31.575845] __walk_page_range+0x637/0x760
[ 31.576316] ? find_vma+0xc5/0x140
[ 31.576715] ? __pfx_find_vma+0x10/0x10
[ 31.577187] ? __this_cpu_preempt_check+0x21/0x30
[ 31.577736] ? __sanitizer_cov_trace_const_cmp8+0x1c/0x30
[ 31.578344] ? walk_page_test+0xac/0x1c0
[ 31.578803] walk_page_range+0x3a0/0x830
[ 31.579262] ? __pfx_walk_page_range+0x10/0x10
[ 31.579790] madvise_pageout+0x37d/0x8f0
[ 31.580289] ? __pfx_madvise_pageout+0x10/0x10
[ 31.580794] ? mas_prev+0x103/0x650
[ 31.581239] ? __this_cpu_preempt_check+0x21/0x30
[ 31.581756] ? lock_is_held_type+0xf0/0x150
[ 31.582222] do_madvise.part.0+0xaf6/0x2ae0
[ 31.582674] ? __pfx___lock_acquire+0x10/0x10
[ 31.583185] ? __pfx_do_madvise.part.0+0x10/0x10
[ 31.583713] ? lock_release+0x417/0x7e0
[ 31.584156] ? __pfx_lock_release+0x10/0x10
[ 31.584635] ? __this_cpu_preempt_check+0x21/0x30
[ 31.585216] ? seqcount_lockdep_reader_access.constprop.0+0xb4/0xd0
[ 31.585925] ? lockdep_hardirqs_on+0x8a/0x110
[ 31.586420] ? seqcount_lockdep_reader_access.constprop.0+0xb4/0xd0
[ 31.587114] ? trace_hardirqs_on+0x26/0x120
[ 31.587597] ? seqcount_lockdep_reader_access.constprop.0+0xc0/0xd0
[ 31.588330] ? __sanitizer_cov_trace_cmp4+0x1a/0x20
[ 31.588881] ? ktime_get_coarse_real_ts64+0xbf/0xf0
[ 31.589473] ? __audit_syscall_entry+0x39e/0x500
[ 31.590026] __x64_sys_madvise+0x13a/0x180
[ 31.590500] do_syscall_64+0x42/0xf0
[ 31.590916] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 31.591482] RIP: 0033:0x7f47e583ee5d
[ 31.591890] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 93 af 1b 00 f7 d8 64 89 01 48
[ 31.593912] RSP: 002b:00007f47e5bb6df8 EFLAGS: 00000297 ORIG_RAX: 000000000000001c
[ 31.594742] RAX: ffffffffffffffda RBX: 00007f47e5bb7640 RCX: 00007f47e583ee5d
[ 31.595516] RDX: 0000000000000015 RSI: 0000000000004000 RDI: 0000000020ffb000
[ 31.596329] RBP: 00007f47e5bb6e20 R08: 0000000000000000 R09: 0000000000000000
[ 31.597140] R10: 0000000000000000 R11: 0000000000000297 R12: 00007f47e5bb7640
[ 31.597927] R13: 0000000000000000 R14: 00007f47e589f560 R15: 0000000000000000
[ 31.598725] </TASK>
[ 31.598986] irq event stamp: 1493
[ 31.599358] hardirqs last enabled at (1501): [<ffffffff8142b0e5>] console_unlock+0x2d5/0x310
[ 31.600278] hardirqs last disabled at (1508): [<ffffffff8142b0ca>] console_unlock+0x2ba/0x310
[ 31.601242] softirqs last enabled at (1318): [<ffffffff812674f8>] __irq_exit_rcu+0xa8/0x110
[ 31.602187] softirqs last disabled at (1313): [<ffffffff812674f8>] __irq_exit_rcu+0xa8/0x110
[ 31.603220] ---[ end trace 0000000000000000 ]---
"
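
A note on reading the user-space register dump at the end of the trace: on x86_64, ORIG_RAX holds the syscall number and RDI, RSI and RDX hold the first three arguments. The sketch below only converts the hex values from the dump to decimal. It assumes the usual x86_64 numbering, where syscall 28 is madvise and advice value 21 is MADV_PAGEOUT, which is consistent with madvise_pageout appearing in the call trace. The variable names are illustrative.

```shell
# Values copied from the user-space register dump in the trace.
orig_rax=0x1c   # ORIG_RAX: syscall number
rdi=0x20ffb000  # 1st argument: start address
rsi=0x4000      # 2nd argument: length in bytes
rdx=0x15        # 3rd argument: advice

# printf '%d' accepts C-style hex constants and prints them in decimal.
syscall_nr=$(printf '%d' "$orig_rax")
length=$(printf '%d' "$rsi")
advice=$(printf '%d' "$rdx")

echo "syscall=$syscall_nr addr=$rdi len=$length advice=$advice"
```

Under those assumptions the failing call corresponds to madvise(0x20ffb000, 0x4000, MADV_PAGEOUT).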

Hope it helps.

Thanks!

---

If you already have an environment that reproduces the problem, or do not need
one, please ignore the information below.

How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh // needs qemu-system-x86_64; I used v7.1.0
// start3.sh loads bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 (a v6.2-rc5 kernel)
// Change bzImage_xxx in start3.sh to boot a different kernel
// With a different qemu version you may need to remove the line "-drive if=pflash,format=raw,readonly=on,file=./OVMF_CODE.fd \"
Use the command below to log in; root has no password.
ssh -p 10023 root@localhost

After logging in to the VM (virtual machine) successfully, build the reproducer
and copy the binary into the VM as follows, then run it in the VM to reproduce the problem:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@localhost:/root/
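
To confirm that the issue actually reproduced, one option is to save the guest dmesg and search it for the WARNING line. The helper below is a minimal sketch: the name has_shmem_warning and the log file name are hypothetical and not part of the original report. The pattern leaves the CPU, PID and mm/shmem.c line number open, since they can differ between runs and kernel versions.

```shell
# has_shmem_warning: hypothetical helper, not part of the original report.
# Returns 0 if the given dmesg log contains the shmem_writepage WARNING line.
has_shmem_warning() {
    grep -q 'WARNING: CPU: [0-9]* PID: [0-9]* at mm/shmem\.c:[0-9]* shmem_writepage' "$1"
}

# Example use on a saved log (the file name is an assumption):
#   ssh -p 10023 root@localhost dmesg > guest_dmesg.log
#   has_shmem_warning guest_dmesg.log && echo "WARNING reproduced"
```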

Get the bzImage for the target kernel:
Copy the target kconfig to kernel_src/.config, then run:
make olddefconfig
make -jx bzImage // x should be less than or equal to the number of CPUs on your machine
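
The build steps above can be sketched as one shell function. This is only an illustration: build_bzimage is a hypothetical name, both paths are supplied by the caller, and nproc (from GNU coreutils) is assumed to be available to pick the job count.

```shell
# build_bzimage: hypothetical helper wrapping the build steps.
#   $1 = kernel source directory, $2 = path to the target kconfig
build_bzimage() {
    kernel_src=$1
    kconfig=$2
    # Install the target kconfig as the kernel's .config.
    cp "$kconfig" "$kernel_src/.config" || return 1
    # Fill in defaults for any options the kconfig does not set.
    make -C "$kernel_src" olddefconfig || return 1
    # Use one job per online CPU, i.e. x = number of CPUs.
    make -C "$kernel_src" -j"$(nproc)" bzImage
}

# Example use (paths are assumptions):
#   build_bzimage ./kernel_src ./kconfig_origin
```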

Set the bzImage file in start3.sh above to the newly built image to load the target kernel in the VM.


Tips:
If you already have qemu-system-x86_64, please ignore the information below.
To install qemu v7.1.0:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
yum -y install libslirp-devel.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl --enable-usb-redir --enable-slirp
make
make install

Best Regards,
Thanks!