Re: [PATCH v2] vfs: bypass may_create_in_sticky check on newly-created files if task has CAP_FOWNER

From: Christian Brauner
Date: Wed Jul 27 2022 - 10:33:25 EST


On Wed, Jul 27, 2022 at 10:00:14AM -0400, Jeff Layton wrote:
> From: Christian Brauner <brauner@xxxxxxxxxx>
>
> NFS server is exporting a sticky directory (mode 01777) with root
> squashing enabled. Client has protected_regular enabled and then tries to
> open a file as root in that directory. File is created (with ownership
> set to nobody:nobody) but the open syscall returns an error. The problem
> is may_create_in_sticky which rejects the open even though the file has
> already been created.
>
> Add a new condition to may_create_in_sticky. If the file was just
> created, then allow bypassing the ownership check if the task has
> CAP_FOWNER. With this change, the initial open of a file by root works,
> but later opens of the same file will fail.
>
> Note that we can contrive a similar situation by exporting with
> all_squash and opening the file as an unprivileged user. This patch does
> not fix that case. I suspect that that configuration is likely to be
> fundamentally incompatible with the protect_* sysctls enabled on the
> clients.
>
> Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976829
> Reported-by: Yongchen Yang <yoyang@xxxxxxxxxx>
> Suggested-by: Christian Brauner <brauner@xxxxxxxxxx>
> Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> ---
> fs/namei.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> Hi Christian,
>
> I left you as author here since this is basically identical to the patch
> you suggested. Let me know if that's an issue.

No, that's fine.

It feels pretty strange to be able to create a file and then not be
able to open it, fwiw. But we basically have that with nodev already, and
we implicitly encode this in may_create_in_sticky() for the protected_*
sysctls. Relaxing this through CAP_FOWNER makes sense, as that capability
is explicitly meant to "Bypass permission checks on operations that
normally require the filesystem UID of the process to match the UID of
the file".
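
To make the intended behaviour concrete, here is a rough userspace model
of the decision as I understand it. It is only a sketch, not the code in
fs/namei.c: the struct, the function name and the MODE_* constants are
made up for illustration, it only models regular files, it ignores
idmapped mounts and auditing, and where exactly the new "created and
CAP_FOWNER" condition sits in the real function is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel state the check looks at. */
struct sticky_check {
	unsigned int dir_mode;	/* mode of the parent directory, e.g. 01777 */
	unsigned int dir_uid;	/* owner of the parent directory */
	unsigned int file_uid;	/* owner of the file as the client sees it */
	unsigned int fsuid;	/* fsuid of the opening task */
	int protected_regular;	/* fs.protected_regular: 0, 1 or 2 */
	bool created;		/* this open created the file (FMODE_CREATED) */
	bool cap_fowner;	/* task has CAP_FOWNER */
};

#define MODE_STICKY	01000u
#define MODE_GRP_WRITE	00020u
#define MODE_OTH_WRITE	00002u

/* Simplified model: returns 0 to allow the open, -EACCES to reject it. */
int model_may_create_in_sticky(const struct sticky_check *c)
{
	if (!c->protected_regular ||
	    !(c->dir_mode & MODE_STICKY) ||
	    c->file_uid == c->dir_uid ||
	    c->file_uid == c->fsuid ||
	    (c->created && c->cap_fowner))	/* condition added by the patch */
		return 0;

	if ((c->dir_mode & MODE_OTH_WRITE) ||
	    ((c->dir_mode & MODE_GRP_WRITE) && c->protected_regular >= 2))
		return -EACCES;

	return 0;
}

/* The root_squash case from the commit message, plus the all_squash one. */
void example_root_squash(void)
{
	struct sticky_check c = {
		.dir_mode = 01777, .dir_uid = 0,
		.file_uid = 65534,	/* squashed to nobody by the server */
		.fsuid = 0, .protected_regular = 1,
		.created = true, .cap_fowner = true,
	};

	/* Initial open by root: file was just created, so it is allowed. */
	assert(model_may_create_in_sticky(&c) == 0);

	/* Later open of the same file: no longer "created", so it fails. */
	c.created = false;
	assert(model_may_create_in_sticky(&c) == -EACCES);

	/* all_squash with an unprivileged user: not fixed by the patch. */
	c.created = true;
	c.cap_fowner = false;
	c.fsuid = 1000;
	assert(model_may_create_in_sticky(&c) == -EACCES);
}
```

The model only allows the open that created the file, and only for a
task with CAP_FOWNER, which matches the behaviour the commit message
describes.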

One thing that I'm not sure about is something that Seth pointed out,
namely whether there's any NFS server-side race window that would render
the FMODE_CREATED state passed to may_create_in_sticky() inaccurate.