Re: [v6.1] kernel BUG in ext4_writepages

From: Muhammad Usama Anjum
Date: Mon Aug 14 2023 - 01:32:52 EST


On 8/10/23 3:49 PM, Muhammad Usama Anjum wrote:
> Hi,
>
> Syzbot has reported hitting this bug on the 6.1.18 and 5.15.101 LTS
> kernels and has provided a reproducer as well.
>
> BUG_ON(ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA));
>
> I've copied the same config and reproduced the bug on 6.1.18, 6.1.44 and
> next-20230809.
>
> This part of the code hasn't changed since it was introduced in
> 4e7ea81db53465 ("ext4: restructure writeback path"). I'm not sure why the
> inlined data is being destroyed before it is copied somewhere else.
>
> Please consider this a report.
>
> Regards,
> Muhammad Usama Anjum
>
>
> On 3/13/23 11:34 AM, syzbot wrote:
>> syzbot has found a reproducer for the following issue on:
>>
>> HEAD commit: 1cc3fcf63192 Linux 6.1.18
>> git tree: linux-6.1.y
>> console output: https://syzkaller.appspot.com/x/log.txt?x=10d4b342c80000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=157296d36f92ea19
> ^ Kernel config
>
>> dashboard link: https://syzkaller.appspot.com/bug?extid=a8068dd81edde0186829
>> compiler: Debian clang version 15.0.7, GNU ld (GNU Binutils for Debian) 2.35.2
>> userspace arch: arm64
>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13512ec6c80000
>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15ca0ff4c80000
> ^ Reproducers. The C reproducer triggers the bug easily.
>
>>
>> Downloadable assets:
>> disk image: https://storage.googleapis.com/syzbot-assets/0e4c0d43698b/disk-1cc3fcf6.raw.xz
>> vmlinux: https://storage.googleapis.com/syzbot-assets/a4de39d735de/vmlinux-1cc3fcf6.xz
>> kernel image: https://storage.googleapis.com/syzbot-assets/82bab928f6e3/Image-1cc3fcf6.gz.xz
>> mounted in repro: https://storage.googleapis.com/syzbot-assets/bf2e21b96210/mount_0.gz
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+a8068dd81edde0186829@xxxxxxxxxxxxxxxxxxxxxxxxx
>>
>> ------------[ cut here ]------------
>> kernel BUG at fs/ext4/inode.c:2746!
>> Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
>> Modules linked in:
>> CPU: 0 PID: 11 Comm: kworker/u4:1 Not tainted 6.1.18-syzkaller #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023
>> Workqueue: writeback wb_workfn (flush-7:0)
>> pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>> pc : ext4_writepages+0x35f4/0x35f8 fs/ext4/inode.c:2745
>> lr : ext4_writepages+0x35f4/0x35f8 fs/ext4/inode.c:2745
>> sp : ffff800019d16d40
>> x29: ffff800019d17120 x28: ffff800008e691e4 x27: dfff800000000000
>> x26: ffff0000de1f3ee0 x25: ffff800019d17590 x24: ffff800019d17020
>> x23: ffff0000dd616000 x22: ffff800019d16f40 x21: ffff0000de1f4108
>> x20: 0000008410000000 x19: 0000000000000001 x18: ffff800019d16a20
>> x17: ffff80001572d000 x16: ffff8000083099b4 x15: 000000000000ba31
>> x14: 00000000ffffffff x13: dfff800000000000 x12: 0000000000000001
>> x11: ff80800008e6c7d8 x10: 0000000000000000 x9 : ffff800008e6c7d8
>> x8 : ffff0000c099b680 x7 : 0000000000000000 x6 : 0000000000000000
>> x5 : 0000000000000080 x4 : 0000000000000000 x3 : 0000000000000001
>> x2 : 0000000000000000 x1 : 0000008000000000 x0 : 0000000000000000
>> Call trace:
>> ext4_writepages+0x35f4/0x35f8 fs/ext4/inode.c:2745
>> do_writepages+0x2e8/0x56c mm/page-writeback.c:2469
>> __writeback_single_inode+0x228/0x1ec8 fs/fs-writeback.c:1587
>> writeback_sb_inodes+0x9c0/0x1844 fs/fs-writeback.c:1878
>> wb_writeback+0x4f8/0x1580 fs/fs-writeback.c:2052
>> wb_do_writeback fs/fs-writeback.c:2195 [inline]
>> wb_workfn+0x460/0x11b8 fs/fs-writeback.c:2235
>> process_one_work+0x868/0x16f4 kernel/workqueue.c:2289
>> worker_thread+0x8e4/0xfec kernel/workqueue.c:2436
>> kthread+0x24c/0x2d4 kernel/kthread.c:376
>> ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860
>> Code: d4210000 97da5cfa d4210000 97da5cf8 (d4210000)
>> ---[ end trace 0000000000000000 ]---
>>
>>

The last refactoring of this code was done by 4e7ea81db53465 in 2013, and
the code segment in question was present even before that. This means the
bug has been present for several years. 4.14 is the oldest kernel still
maintained today, so the bug affects all current LTS and mainline kernels.
I'll report 4e7ea81db53465 to regzbot for proper tracking, so the bug
report will probably get associated with all LTS kernels as well.

#regzbot title: Race condition between buffer write and page_mkwrite

#regzbot introduced: 4e7ea81db53465

#regzbot monitor:
https://lore.kernel.org/all/20230530134405.322194-1-libaokun1@xxxxxxxxxx

--
BR,
Muhammad Usama Anjum