Re: race between vfs_rename and do_linkat (mv and link)

From: Miklos Szeredi
Date: Tue Feb 15 2022 - 04:56:47 EST

Next message: Lukasz Luba: "Re: [PATCH v5] drivers: thermal: clear all mitigation when thermal zone is disabled"
Previous message: Paul Kocialkowski: "Re: [PATCH v2 37/66] media: sun6i-csi: Move power management to runtime pm in capture"
In reply to: Xavier Roche: "fs: race between vfs_rename and do_linkat (mv and link)"
Next in thread: Al Viro: "Re: race between vfs_rename and do_linkat (mv and link)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Mon, 14 Feb 2022 at 22:07, Xavier Roche <xavier.roche@xxxxxxxxxxx> wrote:
>
> There has been a longstanding race condition between vfs_rename and do_linkat,
> when those operations are done in parallel:
>
> 1. Moving a file to an existing target file (eg. mv file target)
> 2. Creating a link from the target file to a third file (eg. ln target link)
>
> A typical example would be (1) a regular process putting a new version
> of a database in place and (2) a regular process backuping the live
> database by hardlinking it.
>
> My understanding is that as the target file is never erased on client
> side, but just replaced, the link should never fail.
>
> The issue seem to lie inside vfs_link (fs/namei.c):
> inode_lock(inode);
> /* Make sure we don't allow creating hardlink to an unlinked file */
> if (inode->i_nlink == 0 && !(inode->i_state & I_LINKABLE))
> error = -ENOENT;
>
> The possible answer is that the inode refcount is zero because the
> file has just been replaced concurrently, old file being erased, and
> as such, the link operation is failing.
>
> The race appears to have been introduced by aae8a97d3ec30, to fix
> _another_ race between unlink and link (but I'm not sure to understand
> what were the implications).
>
> Reverting the inode->i_nlink == 0 section "fixes" the issue, but would
> probably reintroduce this another issue.
>
> At this point I don't know what would be the best way to fix this issue.

Doing "lock_rename() + lookup last components" would fix this race.
If this was only done on retry, then that would prevent possible
performance regressions, at the cost of extra complexity.

Thanks,
Miklos

Next message: Lukasz Luba: "Re: [PATCH v5] drivers: thermal: clear all mitigation when thermal zone is disabled"
Previous message: Paul Kocialkowski: "Re: [PATCH v2 37/66] media: sun6i-csi: Move power management to runtime pm in capture"
In reply to: Xavier Roche: "fs: race between vfs_rename and do_linkat (mv and link)"
Next in thread: Al Viro: "Re: race between vfs_rename and do_linkat (mv and link)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]