Re: [RFC] atomic open(..., O_CREAT | ...)

From: Trond Myklebust
Date: Tue Aug 09 2005 - 13:39:33 EST


ty den 09.08.2005 Klokka 19:44 (+0200) skreiv Miklos Szeredi:

> > There is quite a bit of code out there that assumes it is free to stuff
> > things into nd->mnt and nd->dentry. Some of it is Al Viro's code, some
> > of it is from other people.
> > For instance, the ESTALE handling will just save nd->mnt/nd->dentry
> > before calling __link_path_walk(), then restore + retry if the ESTALE
> > error comes up.
>
> Yeah, but how is that relevant to the fact, that after
> path_release_*() _nothing_ will be valid in the nameidata, not
> nd->mnt, not nd->dentry, and not nd->intent.open.file. So what's the
> point in setting it to NULL if it must never be used anyway?

path_release() does _not_ invalidate the nameidata. Look for instance at
__emul_lookup_dentry(), which clearly makes use of that fact.

> > > If there's any chance that the path walk restart thing will invoke the
> > > filesystems open code twice (I doubt it), then the filesystem must
> > > make sure to check intent.open.file, whether it has already been set,
> > > and fput() it before setting it another time.
> >
> > The only user of that code is NFSv4, and we will never even try to
> > allocate a file if the OPEN call returned an ESTALE.
>
> That's why I doubted that this is an issue.
>
> > > > Why do we want to keep this behaviour? It is undocumented, it is
> > > > non-posix, and it appears to do nothing you cannot do with the existing
> > > > access() call.
> > > >
> > > > Are there any applications using it? If so, which ones, and why?
> > >
> > > I have absolutely no idea.
> > >
> > > Looking closer, there's a problem with O_TRUNC as well:
> > >
> > > namei_flags = flags;
> > > if ((namei_flags+1) & O_ACCMODE)
> > > namei_flags++;
> > > if (namei_flags & O_TRUNC)
> > > namei_flags |= 2;
> > >
> > > So if flags is O_RDONLY|O_TRUNC, intent.open.flags will be
> > > FMODE_WRITE|FMODE_READ|O_TRUNC, but filp->f_mode will be FMODE_READ.
> >
> > That is a bug that needs to be fixed in the intent.open.flags. We don't
> > ever want to be opening the file for writing at the filesystem level
> > when the user specified open for read.
>
> No, it's being opened for read. The namei_flags (and hence
> intent.open.flags) will have FMODE_WRITE, so that the permission is
> checked for write.

Firstly, the open_namei() flags field is not a "permissions" field. It
contains open mode information. The calculation of the open permissions
flags is done by open_namei() itself.

Secondly, what advantage is there in allowing callers of open_namei() to
be able to override the MAY_WRITE check when doing open(O_TRUNC)? This
is a calculation that should be done _once_ in order to always get it
right, and it should therefore be done in open_namei() together with the
rest of the permissions calculation.

Cheers,
Trond

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/