Re: [v6.1] kernel BUG in ext4_writepages

From: Muhammad Usama Anjum
Date: Mon Aug 14 2023 - 01:37:12 EST


On 8/14/23 10:31 AM, Muhammad Usama Anjum wrote:
> On 8/10/23 3:49 PM, Muhammad Usama Anjum wrote:
>> Hi,
>>
>> Syzbot has reported hitting this bug on the 6.1.18 and 5.15.101 LTS kernels
>> and has provided a reproducer as well.
>>
>> BUG_ON(ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA));
>>
>> I've copied the same config and reproduced the bug on 6.1.18, 6.1.44 and
>> next-20230809.
>>
>> This part of the code hasn't changed since it was introduced in
>> 4e7ea81db53465 ("ext4: restructure writeback path"). I'm not sure why the
>> inline data is being destroyed before it is copied somewhere else.
>>
>> Please consider this a report.
>>
>> Regards,
>> Muhammad Usama Anjum
>>
>>
>> On 3/13/23 11:34 AM, syzbot wrote:
>>> syzbot has found a reproducer for the following issue on:
>>>
>>> HEAD commit: 1cc3fcf63192 Linux 6.1.18
>>> git tree: linux-6.1.y
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=10d4b342c80000
>>> kernel config: https://syzkaller.appspot.com/x/.config?x=157296d36f92ea19
>> ^ Kernel config
>>
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=a8068dd81edde0186829
>>> compiler: Debian clang version 15.0.7, GNU ld (GNU Binutils for Debian) 2.35.2
>>> userspace arch: arm64
>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13512ec6c80000
>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15ca0ff4c80000
>> ^ Reproducers. The C reproducer triggers the bug easily.
>>
>>>
>>> Downloadable assets:
>>> disk image: https://storage.googleapis.com/syzbot-assets/0e4c0d43698b/disk-1cc3fcf6.raw.xz
>>> vmlinux: https://storage.googleapis.com/syzbot-assets/a4de39d735de/vmlinux-1cc3fcf6.xz
>>> kernel image: https://storage.googleapis.com/syzbot-assets/82bab928f6e3/Image-1cc3fcf6.gz.xz
>>> mounted in repro: https://storage.googleapis.com/syzbot-assets/bf2e21b96210/mount_0.gz
>>>
>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>> Reported-by: syzbot+a8068dd81edde0186829@xxxxxxxxxxxxxxxxxxxxxxxxx
>>>
>>> ------------[ cut here ]------------
>>> kernel BUG at fs/ext4/inode.c:2746!
>>> Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
>>> Modules linked in:
>>> CPU: 0 PID: 11 Comm: kworker/u4:1 Not tainted 6.1.18-syzkaller #0
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023
>>> Workqueue: writeback wb_workfn (flush-7:0)
>>> pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>> pc : ext4_writepages+0x35f4/0x35f8 fs/ext4/inode.c:2745
>>> lr : ext4_writepages+0x35f4/0x35f8 fs/ext4/inode.c:2745
>>> sp : ffff800019d16d40
>>> x29: ffff800019d17120 x28: ffff800008e691e4 x27: dfff800000000000
>>> x26: ffff0000de1f3ee0 x25: ffff800019d17590 x24: ffff800019d17020
>>> x23: ffff0000dd616000 x22: ffff800019d16f40 x21: ffff0000de1f4108
>>> x20: 0000008410000000 x19: 0000000000000001 x18: ffff800019d16a20
>>> x17: ffff80001572d000 x16: ffff8000083099b4 x15: 000000000000ba31
>>> x14: 00000000ffffffff x13: dfff800000000000 x12: 0000000000000001
>>> x11: ff80800008e6c7d8 x10: 0000000000000000 x9 : ffff800008e6c7d8
>>> x8 : ffff0000c099b680 x7 : 0000000000000000 x6 : 0000000000000000
>>> x5 : 0000000000000080 x4 : 0000000000000000 x3 : 0000000000000001
>>> x2 : 0000000000000000 x1 : 0000008000000000 x0 : 0000000000000000
>>> Call trace:
>>> ext4_writepages+0x35f4/0x35f8 fs/ext4/inode.c:2745
>>> do_writepages+0x2e8/0x56c mm/page-writeback.c:2469
>>> __writeback_single_inode+0x228/0x1ec8 fs/fs-writeback.c:1587
>>> writeback_sb_inodes+0x9c0/0x1844 fs/fs-writeback.c:1878
>>> wb_writeback+0x4f8/0x1580 fs/fs-writeback.c:2052
>>> wb_do_writeback fs/fs-writeback.c:2195 [inline]
>>> wb_workfn+0x460/0x11b8 fs/fs-writeback.c:2235
>>> process_one_work+0x868/0x16f4 kernel/workqueue.c:2289
>>> worker_thread+0x8e4/0xfec kernel/workqueue.c:2436
>>> kthread+0x24c/0x2d4 kernel/kthread.c:376
>>> ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860
>>> Code: d4210000 97da5cfa d4210000 97da5cf8 (d4210000)
>>> ---[ end trace 0000000000000000 ]---
>>>
>>>
>
> The last refactoring of this code was done by 4e7ea81db53465 in 2013, and
> the code segment in question predates even that. This means the bug has
> been present for several years. 4.14 is the oldest kernel still maintained
> today, so the bug affects all current LTS and mainline kernels. I'll report
> 4e7ea81db53465 to regzbot for proper tracking, so the bug report will
> probably get associated with all LTS kernels as well.
>
> #regzbot title: Race condition between buffer write and page_mkwrite

#regzbot title: ext4: Race condition between buffer write and page_mkwrite

>
> #regzbot introduced: 4e7ea81db53465
>
> #regzbot monitor:
> https://lore.kernel.org/all/20230530134405.322194-1-libaokun1@xxxxxxxxxx
>

--
BR,
Muhammad Usama Anjum