Re: [syzbot] [ext4?] BUG: sleeping function called from invalid context in ext4_update_super

From: Fabio M. De Francesco
Date: Sat Jun 10 2023 - 16:41:47 EST


On sabato 10 giugno 2023 15:52:55 CEST syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: f8dba31b0a82 Merge tag 'asym-keys-fix-for-linus-v6.4-rc5' ..
> git tree: upstream
>
> [...]
>
> Unfortunately, I don't have any reproducer for this issue yet.

Unfortunately :-(

> Downloadable assets:
>
> [...]
>
> EXT4-fs error (device loop4): ext4_get_group_info:331: comm syz-executor.4:
> invalid group 4294819419 BUG: sleeping function called from invalid context
> at include/linux/buffer_head.h:404 in_atomic(): 1, irqs_disabled(): 0,
> non_block: 0, pid: 21305, name: syz-executor.4 preempt_count: 1, expected: 0
> RCU nest depth: 0, expected: 0
> 5 locks held by syz-executor.4/21305:
> #0: ffff8880292c8460 (sb_writers#4){.+.+}-{0:0}, at: do_sendfile+0x5fb/
0xff0
> fs/read_write.c:1253 #1: ffff8880391da200
> (&sb->s_type->i_mutex_key#7){++++}-{3:3}, at: inode_lock
> include/linux/fs.h:775 [inline] #1: ffff8880391da200
> (&sb->s_type->i_mutex_key#7){++++}-{3:3}, at:
> ext4_buffered_write_iter+0xaf/0x3a0 fs/ext4/file.c:283 #2: ffff8880391d9ec8
> (&ei->xattr_sem){++++}-{3:3}, at: ext4_write_lock_xattr fs/ext4/xattr.h:155
> [inline] #2: ffff8880391d9ec8 (&ei->xattr_sem){++++}-{3:3}, at:
> ext4_convert_inline_data_to_extent fs/ext4/inline.c:584 [inline] #2:
> ffff8880391d9ec8 (&ei->xattr_sem){++++}-{3:3}, at:
> ext4_try_to_write_inline_data+0x51d/0x1360 fs/ext4/inline.c:740 #3:
> ffff8880391da088 (&ei->i_data_sem){++++}-{3:3}, at:
> ext4_map_blocks+0x980/0x1cf0 fs/ext4/inode.c:616 #4: ffff88803944f018
> (&bgl->locks[i].lock){+.+.}-{2:2}, at: spin_trylock
> include/linux/spinlock.h:360 [inline] #4: ffff88803944f018
> (&bgl->locks[i].lock){+.+.}-{2:2}, at: ext4_lock_group fs/ext4/ext4.h:3407
> [inline] #4: ffff88803944f018 (&bgl->locks[i].lock){+.+.}-{2:2}, at:
> ext4_mb_try_best_found+0x1ca/0x5a0 fs/ext4/mballoc.c:2166 Preemption
disabled
> at:
> [<0000000000000000>] 0x0
> CPU: 0 PID: 21305 Comm: syz-executor.4 Not tainted
> 6.4.0-rc5-syzkaller-00002-gf8dba31b0a82 #0 Hardware name: Google Google
> Compute Engine/Google Compute Engine, BIOS Google 05/25/2023 Call Trace:
> <TASK>
> __dump_stack lib/dump_stack.c:88 [inline]
> dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106
> __might_resched+0x5cf/0x780 kernel/sched/core.c:10153
> lock_buffer include/linux/buffer_head.h:404 [inline]
> ext4_update_super+0x93/0x1230 fs/ext4/super.c:6039
> ext4_commit_super+0xd0/0x4c0 fs/ext4/super.c:6117
> ext4_handle_error+0x5ee/0x8b0 fs/ext4/super.c:676

Well, I'm a new to filesystems. However, I'd like to test a change in
ext4_handle_error().

Currently I see that errors are handled according to the next snippet of code
from the above-mentioned function (please note that we are in atomic context):

if (continue_fs)
if (continue_fs && journal)
schedule_work(&EXT4_SB(sb)->s_error_work);
else
ext4_commit_super(sb);

If evaluates false, we directly call ext4_commit_super(), forgetting that,
AFAICS we are in atomic context.

Obviously, we know that ext4_update_super() calls lock_buffer(), which
might_sleep().

As I said I have only little experience with filesystems, so my question is:
despite the overhead, can we delete the check and do the following?

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 05fcecc36244..574b096de059 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -662,19 +662,8 @@ static void ext4_handle_error(struct super_block *sb,
bool force_ro, int error,
jbd2_journal_abort(journal, -EIO);
}

- if (!bdev_read_only(sb->s_bdev)) {
- save_error_info(sb, error, ino, block, func, line);
- /*
- * In case the fs should keep running, we need to writeout
- * superblock through the journal. Due to lock ordering
- * constraints, it may not be safe to do it right here so we
- * defer superblock flushing to a workqueue.
- */
- if (continue_fs && journal)
- schedule_work(&EXT4_SB(sb)->s_error_work);
- else
- ext4_commit_super(sb);
- }
+ if (!bdev_read_only(sb->s_bdev))
+ schedule_work(&EXT4_SB(sb)->s_error_work);

/*
* We force ERRORS_RO behavior when system is rebooting. Otherwise we

Am I missing something I'm not able to see here?
If not, I'll try this diff if and when Syzkaller provides a reproducer.

Thanks,

Fabio

> __ext4_error+0x277/0x3b0 fs/ext4/super.c:776
> ext4_get_group_info+0x382/0x3e0 fs/ext4/balloc.c:331
> ext4_mb_new_inode_pa+0x89c/0x1300 fs/ext4/mballoc.c:4915
> ext4_mb_try_best_found+0x3a1/0x5a0 fs/ext4/mballoc.c:2171
> ext4_mb_regular_allocator+0x3511/0x3c20 fs/ext4/mballoc.c:2784
> ext4_mb_new_blocks+0xe5f/0x44a0 fs/ext4/mballoc.c:5843
> ext4_alloc_branch fs/ext4/indirect.c:340 [inline]
> ext4_ind_map_blocks+0x10d7/0x29e0 fs/ext4/indirect.c:635
> ext4_map_blocks+0x9e7/0x1cf0 fs/ext4/inode.c:625
> _ext4_get_block+0x238/0x6a0 fs/ext4/inode.c:779
> __block_write_begin_int+0x548/0x1a50 fs/buffer.c:2064
> ext4_try_to_write_inline_data+0x7ed/0x1360 fs/ext4/inline.c:740
> ext4_write_begin+0x290/0x10b0 fs/ext4/inode.c:1147
> ext4_da_write_begin+0x300/0xa40 fs/ext4/inode.c:2893
> generic_perform_write+0x300/0x5e0 mm/filemap.c:3923
> ext4_buffered_write_iter+0x122/0x3a0 fs/ext4/file.c:289
> ext4_file_write_iter+0x1d6/0x1930
> do_iter_write+0x7b1/0xcb0 fs/read_write.c:860
> iter_file_splice_write+0x843/0xfe0 fs/splice.c:795
> do_splice_from fs/splice.c:873 [inline]
> direct_splice_actor+0xe7/0x1c0 fs/splice.c:1039
> splice_direct_to_actor+0x4c4/0xbd0 fs/splice.c:994
> do_splice_direct+0x283/0x3d0 fs/splice.c:1082
> do_sendfile+0x620/0xff0 fs/read_write.c:1254
> __do_sys_sendfile64 fs/read_write.c:1322 [inline]
> __se_sys_sendfile64+0x17c/0x1e0 fs/read_write.c:1308
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7f0ff0c8c169
> Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7
48
> 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73
> 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f0ff1944168
> EFLAGS: 00000246 ORIG_RAX: 0000000000000028 RAX: ffffffffffffffda RBX:
> 00007f0ff0dabf80 RCX: 00007f0ff0c8c169
> RDX: 0000000000000000 RSI: 0000000000000007 RDI: 0000000000000006
> RBP: 00007f0ff0ce7ca1 R08: 0000000000000000 R09: 0000000000000000
> R10: 0001000000201005 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007ffe35f5084f R14: 00007f0ff1944300 R15: 0000000000022000
> </TASK>
> BUG: scheduling while atomic: syz-executor.4/21305/0x00000002
> 5 locks held by syz-executor.4/21305:
> #0: ffff8880292c8460 (sb_writers#4){.+.+}-{0:0}, at: do_sendfile+0x5fb/
0xff0
> fs/read_write.c:1253 #1: ffff8880391da200
> (&sb->s_type->i_mutex_key#7){++++}-{3:3}, at: inode_lock
> include/linux/fs.h:775 [inline] #1: ffff8880391da200
> (&sb->s_type->i_mutex_key#7){++++}-{3:3}, at:
> ext4_buffered_write_iter+0xaf/0x3a0 fs/ext4/file.c:283 #2: ffff8880391d9ec8
> (&ei->xattr_sem){++++}-{3:3}, at: ext4_write_lock_xattr fs/ext4/xattr.h:155
> [inline] #2: ffff8880391d9ec8 (&ei->xattr_sem){++++}-{3:3}, at:
> ext4_convert_inline_data_to_extent fs/ext4/inline.c:584 [inline] #2:
> ffff8880391d9ec8 (&ei->xattr_sem){++++}-{3:3}, at:
> ext4_try_to_write_inline_data+0x51d/0x1360 fs/ext4/inline.c:740 #3:
> ffff8880391da088 (&ei->i_data_sem){++++}-{3:3}, at:
> ext4_map_blocks+0x980/0x1cf0 fs/ext4/inode.c:616 #4: ffff88803944f018
> (&bgl->locks[i].lock){+.+.}-{2:2}, at: spin_trylock
> include/linux/spinlock.h:360 [inline] #4: ffff88803944f018
> (&bgl->locks[i].lock){+.+.}-{2:2}, at: ext4_lock_group fs/ext4/ext4.h:3407
> [inline] #4: ffff88803944f018 (&bgl->locks[i].lock){+.+.}-{2:2}, at:
> ext4_mb_try_best_found+0x1ca/0x5a0 fs/ext4/mballoc.c:2166 Modules linked in:
> Preemption disabled at:
> [<0000000000000000>] 0x0
>
>
> ---

[...]