Re: [PATCH] ext4: Add extend check to prevent BUG() in ext4_es_end

From: Theodore Ts'o
Date: Mon Nov 28 2022 - 16:16:45 EST


On Tue, Oct 11, 2022 at 11:38:37AM -0700, Tadeusz Struk wrote:
> On 9/30/22 13:25, Tadeusz Struk wrote:
> > Syzbot reported an issue with ext4 extents. The reproducer creates
> > a corrupted ext4 fs image in memory, and mounts it as a loop device.
> > It invokes the ext4_cache_extents() and ext4_find_extent(), which
> > eventually triggers a BUG() in ext4_es_end() causing a kernel crash.
> > It triggers on mainline, and every kernel version back to v4.14.
> > Add a call ext4_ext_check_inode() in ext4_find_extent() to prevent
> > the crash.
> >
> > To: "Theodore Ts'o"<tytso@xxxxxxx>
> > Cc: "Andreas Dilger"<adilger.kernel@xxxxxxxxx>
> > Cc:<linux-ext4@xxxxxxxxxxxxxxx>
> > Cc:<linux-kernel@xxxxxxxxxxxxxxx>
> > Cc:<stable@xxxxxxxxxxxxxxx>
> >
> > Link:https://syzkaller.appspot.com/bug?id=641e7a4b900015c5d7a729d6cc1fba7a928a88f9
> > Reported-by:syzbot+a22dc4b0744ac658ed9b@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Signed-off-by: Tadeusz Struk<tadeusz.struk@xxxxxxxxxx>
>
> Any comments/feedback on this one?

Hi, I can't replicate the problem using the syzkaller repro:

root@kvm-xfstests:/vdc# /vtmp/repro
[ 14.840406] loop0: detected capacity change from 0 to 1051
[ 14.840965] EXT4-fs (loop0): ext4_check_descriptors: Checksum for group 0 failed (60935!=0)
[ 14.841468] EXT4-fs error (device loop0): ext4_ext_check_inode:520: inode #2: comm repro: pblk 0 bad header/extent: invalid magic - magic 9fe4, entries 42, max 0(0), depth 0(0)
[ 14.842188] EXT4-fs (loop0): get root inode failed
[ 14.842401] EXT4-fs (loop0): mount failed

And by inspection, if there is a corrupted inode, it should have been
caught much sooner, in ext4_iget():

} else if (!ext4_has_inline_data(inode)) {
/* validate the block references in the inode */
if (!(EXT4_SB(sb)->s_mount_state & EXT4_FC_REPLAY) &&
(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
(S_ISLNK(inode->i_mode) &&
!ext4_inode_is_fast_symlink(inode)))) {
if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
ret = ext4_ext_check_inode(inode);
else
ret = ext4_ind_check_inode(inode);
}
}
if (ret)
goto bad_inode;

... and this check has been around for quite some time.

It's much more efficient to check for a corrupted inode data structure
at the time when the inode is read into memory, as opposed to every
single time we try to lookup a logical->physical block map, in
ext4_find_extent().

If you can reproduce a failure on a modern kernel, please let me know,
but it looks like syzkaller had only reported it on androd-54,
android-5-10, linux-4.14, and linux-4.19 in any case.

Cheers,

- Ted