Re: [PATCH] nfsd: properly tear down server when write_ports fails

From: Chuck Lever
Date: Tue Dec 12 2023 - 09:06:09 EST


On Tue, Dec 12, 2023 at 08:50:46AM -0500, Chuck Lever wrote:
> On Mon, Dec 11, 2023 at 06:11:04PM -0500, Jeff Layton wrote:
> > On Tue, 2023-12-12 at 09:59 +1100, NeilBrown wrote:
> > > On Tue, 12 Dec 2023, Jeff Layton wrote:
> > > > When the initial write to the "portlist" file fails, we'll currently put
> > > > the reference to the nn->nfsd_serv, but leave the pointer intact. This
> > > > leads to a UAF if someone tries to write to "portlist" again.
> > > >
> > > > Simple reproducer, from a host with nfsd shut down:
> > > >
> > > > # echo "foo 2049" > /proc/fs/nfsd/portlist
> > > > # echo "foo 2049" > /proc/fs/nfsd/portlist
> > > >
> > > > The kernel will oops on the second one when it trips over the dangling
> > > > nn->nfsd_serv pointer. There is a similar bug in __write_ports_addfd.
> > > >
> > > > This patch fixes it by adding some extra logic to nfsd_put to ensure
> > > > that nfsd_last_thread is called prior to putting the reference when the
> > > > conditions are right.
> > > >
> > > > Fixes: 9f28a971ee9f ("nfsd: separate nfsd_last_thread() from nfsd_put()")
> > > > Closes: https://issues.redhat.com/browse/RHEL-19081
> > > > Reported-by: Zhi Li <yieli@xxxxxxxxxx>
> > > > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > > > ---
> > > > This should probably go to stable, but we'll need to backport for v6.6
> > > > since older kernels don't have nfsd_nl_rpc_status_get_done. We should
> > > > just be able to drop that hunk though.
> > > > ---
> > > > fs/nfsd/nfsctl.c | 32 ++++++++++++++++++++++++++++----
> > > > fs/nfsd/nfsd.h | 8 +-------
> > > > fs/nfsd/nfssvc.c | 2 +-
> > > > 3 files changed, 30 insertions(+), 12 deletions(-)
> > >
> > > This is much the same as
> > >
> > > https://lore.kernel.org/linux-nfs/20231030011247.9794-2-neilb@xxxxxxx/
> > >
> > > It seems that didn't land. Maybe I dropped the ball...
> >
> > Indeed it is. I thought the problem seemed familiar. Your set is
> > considerably more comprehensive. Looks like I even sent some Reviewed-
> > bys when you sent it.
> >
> > Chuck, can we get these in or was there a problem with them?
>
> Offhand, I'd say either I was waiting for some review comments
> to be addressed or the mail got lost (vger or Exchange or I
> accidentally deleted the series). I'll go take a look.

I reviewed the thread:

https://lore.kernel.org/linux-nfs/20231030011247.9794-1-neilb@xxxxxxx/

>From the looks of it, I was expecting Neil to address a couple of
review comments and repost. These are the two comments that stand
out to me now:

On 1/5:

> > Then let's add
> >
> > Fixes: ec52361df99b ("SUNRPC: stop using ->sv_nrthreads as a refcount")
> >
> > to this one, since it addresses a crasher seen in the wild.
>
> Sounds good.
>
> > > but it won't fix the hinky error cleanup in nfsd_svc. It looks like that
> > > does get fixed in patch #4 though, so I'm not too concerned.
> >
> > Does that fix also need to be backported?
>
> I think so, but we might want to split that out into a more targeted
> patch and apply it ahead of the rest of the series. Our QA folks seem to
> be able to hit the problem somehow, so it's likely to be triggered by
> people in the field too.

This last paragraph requests a bit of reorganization to enable an
easier backport.

And on 2/5:

> > > > +struct pool_private {
> > > > + struct svc_serv *(*get_serv)(struct seq_file *, bool);
> > >
> > > This bool is pretty ugly. I think I'd rather see two operations here
> > > (get_serv/put_serv). Also, this could use a kerneldoc comment.
> >
> > I agree that bool is ugly, but two function pointers as function args
> > seemed ugly, and stashing them in 'struct svc_serv' seemed ugly.
> > So I picked one. I'd be keen to find an approach that didn't require a
> > function pointer.
> >
> > Maybe sunrpc could declare
> >
> > struct svc_ref {
> > struct mutex mutex;
> > struct svc_serv *serv;
> > }
> >
> > and nfsd could use one of those instead of nfsd_mutex and nfsd_serv, and
> > pass a pointer to it to the open function.
> >
> > But then the mutex would have to be in the per-net structure. And maybe
> > that isn't a bad idea, but it is a change...
> >
> > I guess I could pass pointers to nfsd_mutex and nn->nfsd_serv to the
> > open function....
> >
> > Any other ideas?
>
> I think just passing two function pointers to svc_pool_stats_open, and
> storing them both in the serv is the best solution (for now). Like you
> said, there are no clean options here. That function only has one caller
> though, so at least the nastiness will be confined to that.
>
> Moving the mutex to be per-net does make a lot of sense, but I think
> that's a separate project. If you decide to do that and it allows you to
> make a simpler interface for handling the get/put_serv pointers, then
> the interface can be reworked at that point.

The other requests I see in that thread have already been answered
adequately.


--
Chuck Lever