Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM

From: Jeff Layton
Date: Fri Jul 08 2016 - 12:18:03 EST


On Fri, 2016-07-08 at 11:59 -0400, Oleg Drokin wrote:
> On Jul 8, 2016, at 11:53 AM, Jeff Layton wrote:
>
> > On Fri, 2016-07-08 at 11:14 -0400, Oleg Drokin wrote:
> > > On Jul 8, 2016, at 7:02 AM, Jeff Layton wrote:
> > >
> > > > On Thu, 2016-07-07 at 21:47 -0400, Oleg Drokin wrote:
> > > > > > It looks like we are a bit overzealous about failing mkdir/create/mknod
> > > > > with permission denied if the parent dir is not writeable.
> > > > > Need to make sure the name does not exist first, because we need to
> > > > > return EEXIST in that case.
> > > > >
> > > > > Signed-off-by: Oleg Drokin <green@xxxxxxxxxxxxxx>
> > > > > ---
> > > > > A very similar problem exists with symlinks, but the patch is more
> > > > > involved, so assuming this one is ok, I'll send a symlink one separately.
> > > > > Âfs/nfsd/nfs4proc.c |ÂÂ6 +++++-
> > > > > Âfs/nfsd/vfs.cÂÂÂÂÂÂ| 11 ++++++++++-
> > > > > Â2 files changed, 15 insertions(+), 2 deletions(-)
> > > > >
> > > >
> > > >
> > > > nit: subject says EPERM, but I think you mean EACCES. The mnemonic I've
> > > > always used is that EPERM is "permanent". IOW, changing permissions
> > > > won't ever allow the user to do something. For instance, unprivileged
> > > > users can never chown a file, so they should get back EPERM there. When
> > > > a directory isn't writeable on a create they should get EACCES since
> > > > they could do the create if the directory were writeable.
> > >
> > > Hm, I see, thanks.
> > > Confusing that you get "Permission denied" from perror ;)
> > >
> >
> > Yes indeed. It's a subtle and confusing distinction.
> >
> > > > > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > > > > index de1ff1d..0067520 100644
> > > > > --- a/fs/nfsd/nfs4proc.c
> > > > > +++ b/fs/nfsd/nfs4proc.c
> > > > > @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > > > > Â
> > > > > Â fh_init(&resfh, NFS4_FHSIZE);
> > > > > Â
> > > > > + /*
> > > > > + Â* We just check thta parent is accessible here, nfsd_* do their
> > > > > + Â* own access permission checks
> > > > > + Â*/
> > > > > Â status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
> > > > > - ÂÂÂNFSD_MAY_CREATE);
> > > > > + ÂÂÂNFSD_MAY_EXEC);
> > > > > Â if (status)
> > > > > Â return status;
> > > > > Â
> > > > > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> > > > > index 6fbd81e..6a45ec6 100644
> > > > > --- a/fs/nfsd/vfs.c
> > > > > +++ b/fs/nfsd/vfs.c
> > > > > @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > > > > Â if (isdotent(fname, flen))
> > > > > Â goto out;
> > > > > Â
> > > > > - err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
> > > > > + /*
> > > > > + Â* Even though it is a create, first we see if we are even allowed
> > > > > + Â* to peek inside the parent
> > > > > + Â*/
> > > > > + err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
> > > > > Â if (err)
> > > > > Â goto out;
> > > > > Â
> > > > > @@ -1211,6 +1215,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > > > > Â goto out;Â
> > > > > Â }
> > > > > Â
> > > > > + /* Now let's see if we actually have permissions to create */
> > > > > + err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
> > > > > + if (err)
> > > > > + goto out;
> > > > > +
> > > > > Â if (!(iap->ia_valid & ATTR_MODE))
> > > > > Â iap->ia_mode = 0;
> > > > > Â iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
> > > >
> > > >
> > > > Ouch. This means two nfsd_permission calls per create operation. If
> > > > it's necessary for correctness then so be it, but is it actually
> > > > documented anywhere (POSIX perhaps?) that we must prefer EEXIST over
> > > > EACCES in this situation?
> > >
> > > Opengroup manpage: http://pubs.opengroup.org/onlinepubs/009695399/functions/mkdir.html
> > > newer version is here:
> > > http://pubs.opengroup.org/onlinepubs/9699919799/
> > >
> > > They tell us that we absolutely must fail with EEXIST if the name is a symlink
> > > (so we need to look it up anyway), and also that EEXIST is the failure code
> > > if the path exists.
> > >
> >
> > I'm not sure that that verbiage supersedes the fact that you don't have
> > write permissions on the directory. Does it?
>
> "If path names a symbolic link, mkdir() shall fail and set errno to [EEXIST]."
>
> This sounds pretty straightforward to me, no?
> Since it does not matter that we do not have write permissions here, because
> the name already exists.
>
> (also there are tons of applications that make this assumption when
> badly reimplementing their mkdir -p thing, I imagine they also have this same
> reading of the man page - this is why I even care about it).
>

I always have trouble with this sort of thing. Just because it's in
DESCRIPTION, does that make it supersede the part in ERRORS? IOW, I
think that's just telling you how to handle a symlink as the last
component, not that you have to do that before the permissions check.

Now that said, as a practical matter I do agree that EEXIST _is_
probably the more helpful error message. If there are applications that
rely on this then we probably should just take your patch.
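
For anyone who wants to see which error a given filesystem reports in
this case, here is a minimal userspace sketch. It is not part of the
patch, and the helper names (probe_mkdir_existing_errno,
check_prefers_eexist) and the /tmp location are my own assumptions. It
creates a directory, removes write permission from the parent, and
then calls mkdir() again on the name that already exists. Note that
root bypasses the write-permission check, so the probe only
distinguishes the two errors when run as an unprivileged user.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for mkdtemp() */
#endif
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch: returns the errno that
 * mkdir() reports for a name that already exists inside a parent
 * directory without write permission, or -1 if the setup fails.
 */
static int probe_mkdir_existing_errno(void)
{
	char parent[] = "/tmp/mkdir-probe-XXXXXX";
	char child[64];
	int saved = -1;

	if (!mkdtemp(parent))
		return -1;
	snprintf(child, sizeof(child), "%s/existing", parent);

	if (mkdir(child, 0755) == 0) {
		/* Drop write permission on the parent, keep search (x). */
		if (chmod(parent, 0555) == 0) {
			if (mkdir(child, 0755) == -1)
				saved = errno;
			chmod(parent, 0755);	/* restore so cleanup works */
		}
		rmdir(child);
	}
	rmdir(parent);
	return saved;
}

/* Example caller: aborts unless the filesystem under /tmp prefers EEXIST. */
static void check_prefers_eexist(void)
{
	assert(probe_mkdir_existing_errno() == EEXIST);
}
```

On a local Linux filesystem I'd expect this to report EEXIST, since
the VFS looks the name up before it checks whether a create is
allowed. Pointing the template path at an NFS mount instead of /tmp
would show what the server returns for the same sequence.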

Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxxxxxxx>