Re: [f2fs-dev] [PATCH] f2fs: fix missing mapping caused by the mount/umount race

From: Jaegeuk Kim
Date: Tue Aug 30 2022 - 16:54:58 EST


On 08/30, Chao Yu wrote:
> On 2022/8/30 5:52, Jaegeuk Kim wrote:
> > Sometimes we can get a cached meta_inode which has no aops yet. Let's set it
> > all the time to fix the below panic.
> >
> > Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> > Mem abort info:
> > ESR = 0x0000000086000004
> > EC = 0x21: IABT (current EL), IL = 32 bits
> > SET = 0, FnV = 0
> > EA = 0, S1PTW = 0
> > FSC = 0x04: level 0 translation fault
> > user pgtable: 4k pages, 48-bit VAs, pgdp=0000000109ee4000
> > [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
> > Internal error: Oops: 86000004 [#1] PREEMPT SMP
> > Modules linked in:
> > CPU: 1 PID: 3045 Comm: syz-executor330 Not tainted 6.0.0-rc2-syzkaller-16455-ga41a877bc12d #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/22/2022
> > pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > pc : 0x0
> > lr : folio_mark_dirty+0xbc/0x208 mm/page-writeback.c:2748
> > sp : ffff800012783970
> > x29: ffff800012783970 x28: 0000000000000000 x27: ffff800012783b08
> > x26: 0000000000000001 x25: 0000000000000400 x24: 0000000000000001
> > x23: ffff0000c736e000 x22: 0000000000000045 x21: 05ffc00000000015
> > x20: ffff0000ca7403b8 x19: fffffc00032ec600 x18: 0000000000000181
> > x17: ffff80000c04d6bc x16: ffff80000dbb8658 x15: 0000000000000000
> > x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
> > x11: ff808000083e9814 x10: 0000000000000000 x9 : ffff8000083e9814
> > x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
> > x5 : ffff0000cbb19000 x4 : ffff0000cb3d2000 x3 : ffff0000cbb18f80
> > x2 : fffffffffffffff0 x1 : fffffc00032ec600 x0 : ffff0000ca7403b8
> > Call trace:
> > 0x0
> > set_page_dirty+0x38/0xbc mm/folio-compat.c:62
> > f2fs_update_meta_page+0x80/0xa8 fs/f2fs/segment.c:2369
> > do_checkpoint+0x794/0xea8 fs/f2fs/checkpoint.c:1522
> > f2fs_write_checkpoint+0x3b8/0x568 fs/f2fs/checkpoint.c:1679
> >
> > Cc: stable@xxxxxxxxxxxxxxx
> > Reported-by: syzbot+775a3440817f74fddb8c@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Signed-off-by: Jaegeuk Kim <jaegeuk@xxxxxxxxxx>
> > ---
> > fs/f2fs/inode.c | 13 ++++++++-----
> > 1 file changed, 8 insertions(+), 5 deletions(-)
> >
> > diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
> > index 6d11c365d7b4..1feb0a8a699e 100644
> > --- a/fs/f2fs/inode.c
> > +++ b/fs/f2fs/inode.c
> > @@ -490,10 +490,7 @@ struct inode *f2fs_iget(struct super_block *sb, unsigned long ino)
> > if (!inode)
> > return ERR_PTR(-ENOMEM);
> > - if (!(inode->i_state & I_NEW)) {
> > - trace_f2fs_iget(inode);
> > - return inode;
> > - }
> > + /* We can see an old cached inode. Let's set the aops all the time. */
>
> Why does an old cached inode (one without the I_NEW flag) have a NULL a_ops
> pointer? If it is a bad inode, it should be unhashed before unlock_new_inode().

I'm still digging into this, but it's not a bad inode, nor is it in
I_FREEING | I_CLEAR state. It's very weird that this meta inode is found, via
the global inode hash table, in a newly created superblock. I've checked that
the same superblock pointer was used in the previous tests, but there the inode
was evicted every time.
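
To make the failure mode concrete, here is a minimal userspace sketch in C. It
is not kernel code: every toy_* name is hypothetical, the array stands in for
the global inode hash keyed by (sb, ino), and the stale entry is planted by hand
because how the real one survives is still unknown. It only illustrates why an
early return on !I_NEW leaves a_ops NULL for a cached meta inode, and why
setting a_ops before that check avoids the call through a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_I_NEW	0x1u
#define TOY_META_INO	2UL	/* hypothetical stand-in for F2FS_META_INO(sbi) */
#define TOY_TABLE_SIZE	8

struct toy_aops { int (*dirty_folio)(void); };
struct toy_sb { int id; };
struct toy_inode {
	struct toy_sb *sb;
	unsigned long ino;
	unsigned int state;
	const struct toy_aops *aops;
	int in_use;
};

static int toy_meta_dirty_folio(void) { return 1; }
static const struct toy_aops toy_meta_aops = {
	.dirty_folio = toy_meta_dirty_folio,
};

/* Stand-in for the global inode hash: entries are matched on (sb, ino). */
static struct toy_inode toy_table[TOY_TABLE_SIZE];

/* Return a cached (sb, ino) match, or a fresh entry flagged TOY_I_NEW. */
static struct toy_inode *toy_iget_locked(struct toy_sb *sb, unsigned long ino)
{
	struct toy_inode *free_slot = NULL;

	for (size_t i = 0; i < TOY_TABLE_SIZE; i++) {
		struct toy_inode *inode = &toy_table[i];

		if (inode->in_use && inode->sb == sb && inode->ino == ino)
			return inode;
		if (!inode->in_use && !free_slot)
			free_slot = inode;
	}
	if (!free_slot)
		return NULL;
	free_slot->in_use = 1;
	free_slot->sb = sb;
	free_slot->ino = ino;
	free_slot->state = TOY_I_NEW;
	free_slot->aops = NULL;
	return free_slot;
}

/* Pre-patch flow: a cached inode is returned before aops is assigned. */
static struct toy_inode *toy_iget_old(struct toy_sb *sb, unsigned long ino)
{
	struct toy_inode *inode = toy_iget_locked(sb, ino);

	if (!inode)
		return NULL;
	if (!(inode->state & TOY_I_NEW))
		return inode;
	if (ino == TOY_META_INO)
		inode->aops = &toy_meta_aops;
	inode->state &= ~TOY_I_NEW;
	return inode;
}

/* Patched flow: the meta inode gets its aops whether or not it is new. */
static struct toy_inode *toy_iget_patched(struct toy_sb *sb, unsigned long ino)
{
	struct toy_inode *inode = toy_iget_locked(sb, ino);

	if (!inode)
		return NULL;
	if (ino == TOY_META_INO)
		inode->aops = &toy_meta_aops;
	else if (!(inode->state & TOY_I_NEW))
		return inode;
	inode->state &= ~TOY_I_NEW;	/* no-op when the flag is already clear */
	return inode;
}

/* Assumed scenario: a hashed entry that is not I_NEW and has no aops. */
static struct toy_inode *toy_plant_stale(struct toy_sb *sb, unsigned long ino)
{
	struct toy_inode *inode = toy_iget_locked(sb, ino);

	assert(inode != NULL);
	inode->state &= ~TOY_I_NEW;
	inode->aops = NULL;
	return inode;
}
```

After toy_plant_stale(&sb, TOY_META_INO), toy_iget_old() hands back the stale
entry with aops still NULL, which is the state that ends in "pc : 0x0" once
something calls through it. toy_iget_patched() returns the same entry with aops
pointing at toy_meta_aops.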

>
> Thanks,
>
> > if (ino == F2FS_NODE_INO(sbi) || ino == F2FS_META_INO(sbi))
> > goto make_now;
> > @@ -502,6 +499,11 @@ struct inode *f2fs_iget(struct super_block *sb, unsigned long ino)
> > goto make_now;
> > #endif
> > + if (!(inode->i_state & I_NEW)) {
> > + trace_f2fs_iget(inode);
> > + return inode;
> > + }
> > +
> > ret = do_read_inode(inode);
> > if (ret)
> > goto bad_inode;
> > @@ -557,7 +559,8 @@ struct inode *f2fs_iget(struct super_block *sb, unsigned long ino)
> > file_dont_truncate(inode);
> > }
> > - unlock_new_inode(inode);
> > + if (inode->i_state & I_NEW)
> > + unlock_new_inode(inode);
> > trace_f2fs_iget(inode);
> > return inode;