Re: [PATCH] xfs: introduce protection for drop nlink

From: Dave Chinner
Date: Mon Aug 28 2023 - 01:22:27 EST


On Mon, Aug 28, 2023 at 11:29:51AM +0800, cheng.lin130@xxxxxxxxxx wrote:
> > On Sat, Aug 26, 2023 at 10:54:11PM +0800, cheng.lin130@xxxxxxxxxx wrote:
> > > > > In the old kernel version, this situation was
> > > > > encountered, but I don't know how it happened. It was already a scene
> > > > > with directory errors: "Too many links".
> > How do you overflow the directory link count in XFS? You can't fit
> > 2^31 unique names in the directory data segment - the directory will
> > ENOSPC at 32GB of name data, and that typically occurs with at most
> > 300-500 million dirents (depending on name lengths) in the
> > directory.
> > IOWs, normal operation shouldn't be able to overflow the directory link
> > count at all, and so underruns shouldn't occur, either.
> Customer's explanation: in the directory with the incorrect nlink, not many
> directories are created, and normally there are only 2 regular files.
> xfs_repair found only this one directory with an incorrect nlink.
> systemd-fsck[5635]: Phase 2 - using internal log
> systemd-fsck[5635]: - zero log...
> systemd-fsck[5635]: - scan filesystem freespace and inode maps...
> systemd-fsck[5635]: agi unlinked bucket 9 is 73 in ag 22 (inode=23622320201)

So the directory inode is on the unlinked list, as I suggested it
would be.
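
As a worked check of the numbers in that log line, here is a minimal
sketch (plain Python arithmetic, not XFS code) that splits the reported
inode number into AG number, AG-relative inode number and unlinked
bucket. The helper name split_ino and the 30-bit AG-relative width are
assumptions: the real width comes from the superblock geometry, which
is not shown in this thread, and 30 is used only because it reproduces
"ag 22" from the log. The bucket count of 64 is the number of unlinked
hash buckets in the AGI header.

```python
# Illustrative arithmetic only; split_ino is a hypothetical helper.
XFS_AGI_UNLINKED_BUCKETS = 64  # unlinked-list hash buckets per AGI

# ASSUMPTION: 30 bits of AG-relative inode number. The real value is
# derived from the superblock geometry, which this thread does not show.
AGINO_BITS = 30


def split_ino(ino, agino_bits=AGINO_BITS):
    """Split an inode number into (agno, agino, unlinked bucket)."""
    agno = ino >> agino_bits                     # allocation group number
    agino = ino & ((1 << agino_bits) - 1)        # AG-relative inode number
    bucket = agino % XFS_AGI_UNLINKED_BUCKETS    # AGI unlinked hash bucket
    return agno, agino, bucket


print(split_ino(23622320201))  # (22, 73, 9)
```

Under that assumption the result matches "agi unlinked bucket 9 is 73
in ag 22 (inode=23622320201)".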

> systemd-fsck[5635]: - 21:46:00: scanning filesystem freespace - 32 of 32 allocation groups done
> systemd-fsck[5635]: - found root inode chunk
> ...

How many other inodes were repaired or trashed or moved to
lost+found?

> systemd-fsck[5635]: Phase 7 - verify and correct link counts...
> systemd-fsck[5635]: resetting inode 23622320201 nlinks from 4294967284 to 2

The link count of the directory inode on the unlinked list was
actually -12, so this isn't an "off by one" error. It's still just 2
adjacent bits being cleared when they shouldn't have been, though.
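
For reference, the -12 comes from reading the reported value as a
signed 32-bit number. A minimal sketch of that conversion (the helper
name as_signed32 is made up for illustration):

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


reported = 4294967284  # nlink value printed by xfs_repair in Phase 7
print(hex(reported), as_signed32(reported))  # 0xfffffff4 -12
```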

What is the xfs_info (or mkfs) output for the filesystem that this
occurred on?

.....

> If it's just an incorrect count on one directory, then after ignoring it the
> fs can work normally (with the error). Is it worth stopping the entire fs
> immediately for this condition?

The inode is on the unlinked list with a non-zero link count. That
means it cannot be removed from the unlinked list (because the inode
will not be freed during inactivation) and so the unlinked list is
effectively corrupt. Anything that removes an inode or creates an
O_TMPFILE or uses RENAME_WHITEOUT can trip over this corrupt
unlinked list and have things go from bad to worse. Hence the
corruption is not limited to the directory inode or operations
involving that directory inode. We generally shut down the
filesystem when this sort of corruption occurs - it needs to be
repaired ASAP, otherwise other stuff will randomly fail and you'll
still end up with a shut down filesystem. Better to fail fast in
corruption cases than try to ignore it and vainly hope that
everything will work out for the best....

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx