Re: [GIT PULL] execve updates for v6.8-rc1

From: Al Viro
Date: Thu Jan 11 2024 - 04:47:26 EST


On Tue, Jan 09, 2024 at 07:54:30PM -0800, Linus Torvalds wrote:

> Al, comments? We *could* just special-case the execve() code not to
> use do_filp_open() and avoid this issue that way, but it does feel
> like even the regular open() case is pessimal with that whole RCU
> situation.

Two things, both related to ->atomic_open():
* we pass struct file to ->atomic_open(), so that it either opens
the sucker _or_ stores the resulting dentry in it (if ->atomic_open() bails
and tells us to use such-and-such dentry, other than the one it had been
given).
* cleanup logics becomes interesting if we try to keep struct
file from pass to pass. Sure, if it had never been touched _or_ if it had
only been fed to finish_no_open() (i.e. ->atomic_open() bailout path) -
no problem, we can reuse it. But once it has hit do_dentry_open(), the
things get fishy. We *must* fput() it if we got to setting FMODE_OPENED -
no plausible way around that. But what about the case when we fail
e.g. inside ->open()? Currently we undo just enough for fput() to do
the right thing without FMODE_OPENED, but e.g. security_file_open() has
no undoing for it. Having it called twice on the same struct file might
or might not work on all LSMs, but they hadn't been exposed to that until
now.

We could pass struct file ** to path_openat(), with
file = *fp;
if (!file) {
file = alloc_empty_file(op->open_flag, current_cred());
if (IS_ERR(file))
return file;
*fp = file;
}
in the beginning and have an extra flag that would be
set as soon as we hit do_dentry_open(). Then we could make the fput()
in path_openat() failure handling conditional upon that flag.

Doable, but really not pretty, especially since we'd need to massage
the caller as well... Less painful variant is
if (error == -ECHILD && (flags & LOOKUP_RCU))
return ERR_PTR(-ECHILD); // keep file for non-rcu pass
*fp = NULL;
fput(file);
...
on the way out; that won't help with -ESTALE side of things, but if we
hit *that*, struct file allocation overhead is really noise.

PS: apologies for late reply - had been sick since Saturday, just got more
or less back to normal.