Re: [PATCH 1/3] vfs: add new f_op->syncfs vector

From: Vivek Goyal
Date: Thu Dec 17 2020 - 11:17:19 EST


On Thu, Dec 17, 2020 at 10:57:28AM +0100, Jan Kara wrote:
> On Thu 17-12-20 00:49:35, Al Viro wrote:
> > [Christoph added to Cc...]
> > On Wed, Dec 16, 2020 at 06:31:47PM -0500, Vivek Goyal wrote:
> > > Current implementation of __sync_filesystem() ignores the return code
> > > from ->sync_fs(). I am not sure why that's the case. There must have
> > > been some historical reason for this.
> > >
> > > Ignoring ->sync_fs() return code is problematic for overlayfs where
> > > it can return error if sync_filesystem() on upper super block failed.
> > > That error will simply be lost and syncfs(overlay_fd) will report
> > > success (despite the fact that it failed).
> > >
> > > If we modify existing implementation, there is a concern that it will
> > > lead to user space visible behavior changes and break things. So
> > > instead implement a new file_operations->syncfs() call which will
> > > be called in syncfs() syscall path. Return code from this new
> > > call will be captured. And all the writeback error detection
> > > logic can go in there as well. Only filesystems which implement
> > > this call are affected by this change. Others continue to fall back
> > > to the existing mechanism.
> >
> > That smells like a massive source of confusion down the road. I'd just
> > looked through the existing instances; many always return 0, but quite
> > a few sometimes try to return an error:
> > fs/btrfs/super.c:2412: .sync_fs = btrfs_sync_fs,
> > fs/exfat/super.c:204: .sync_fs = exfat_sync_fs,
> > fs/ext4/super.c:1674: .sync_fs = ext4_sync_fs,
> > fs/f2fs/super.c:2480: .sync_fs = f2fs_sync_fs,
> > fs/gfs2/super.c:1600: .sync_fs = gfs2_sync_fs,
> > fs/hfsplus/super.c:368: .sync_fs = hfsplus_sync_fs,
> > fs/nilfs2/super.c:689: .sync_fs = nilfs_sync_fs,
> > fs/ocfs2/super.c:139: .sync_fs = ocfs2_sync_fs,
> > fs/overlayfs/super.c:399: .sync_fs = ovl_sync_fs,
> > fs/ubifs/super.c:2052: .sync_fs = ubifs_sync_fs,
> > is the list of such. There are 4 method callers:
> > dquot_quota_sync(), dquot_disable(), __sync_filesystem() and
> > sync_fs_one_sb(). For sync_fs_one_sb() we want to ignore the
> > return value; for __sync_filesystem() we almost certainly
> > do *not* - it ends with return __sync_blockdev(sb->s_bdev, wait),
> > after all. The question for that one is whether we want
> > __sync_blockdev() called even in case of ->sync_fs() reporting
> > a failure, and I suspect that it's safer to call it anyway and
> > return the first error value we'd got. No idea about quota
> > situation.
>
> WRT quota situation: All the ->sync_fs() calls there are due to cache
> coherency reasons (we need to get quota changes to disk, then prune quota
> files' page cache, and then userspace can read current quota structures
> from the disk). We don't want to fail dquot_disable() just because caches
> might be incoherent so ignoring ->sync_fs() return value there is fine.
> With dquot_quota_sync() it might make some sense to return the error -
> that's just a backend for Q_SYNC quotactl(2). OTOH I'm not sure anybody
> really cares - Q_SYNC is rarely used.

Thanks Jan. Maybe I will leave dquot_quota_sync() untouched for now. When
somebody needs to capture the return code from ->sync_fs() there, it
can easily be added.
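
For the __sync_filesystem() side, the behavior Al describes above (call
__sync_blockdev() even if ->sync_fs() reports a failure, and return the
first error seen) could look roughly like the userspace mock below. This is
only a sketch of the error-propagation pattern, not kernel code: struct
mock_sb, mock_sync_blockdev(), mock_sync_filesystem() and the two sample
->sync_fs() callbacks are made-up stand-ins for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct super_block; not the real definition. */
struct mock_sb {
	/* Models s_op->sync_fs; may be NULL when the fs has no such method. */
	int (*sync_fs)(struct mock_sb *sb, int wait);
	int bdev_err;    /* value the mocked block device sync returns */
	int bdev_synced; /* how many times the block device sync ran */
};

/* Hypothetical stand-in for __sync_blockdev(sb->s_bdev, wait). */
static int mock_sync_blockdev(struct mock_sb *sb, int wait)
{
	(void)wait;
	sb->bdev_synced++;
	return sb->bdev_err;
}

/*
 * Capture the ->sync_fs() return value instead of dropping it, still
 * sync the block device afterwards, and return the first error seen.
 */
static int mock_sync_filesystem(struct mock_sb *sb, int wait)
{
	int ret = 0;
	int ret2;

	if (sb->sync_fs)
		ret = sb->sync_fs(sb, wait);

	/* Called unconditionally, even if ->sync_fs() failed. */
	ret2 = mock_sync_blockdev(sb, wait);

	return ret ? ret : ret2;
}

/* Sample callbacks: one that always fails, one that always succeeds. */
static int failing_sync_fs(struct mock_sb *sb, int wait)
{
	(void)sb;
	(void)wait;
	return -EIO;
}

static int ok_sync_fs(struct mock_sb *sb, int wait)
{
	(void)sb;
	(void)wait;
	return 0;
}
```

With failing_sync_fs() installed, the caller gets -EIO and the block device
sync still runs once. With ok_sync_fs(), any error from the block device
sync is what gets returned.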

Vivek