Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals

From: Robert Milkowski
Date: Mon Jan 27 2020 - 10:34:45 EST


On Mon, 27 Jan 2020 at 15:05, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>
>
>
> > On Jan 27, 2020, at 9:45 AM, Robert Milkowski <rmilkowski@xxxxxxxxx> wrote:
> >
> > On Thu, 23 Jan 2020 at 19:08, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
> >>
> >> On Wed, 2020-01-22 at 19:10 +0000, Schumaker, Anna wrote:
> >>> Hi Robert,
> >>>
> >>> On Mon, 2020-01-20 at 17:55 +0000, Robert Milkowski wrote:
> >>>>> -----Original Message-----
> >>>>> From: Chuck Lever <chuck.lever@xxxxxxxxxx>
> >>>>> Sent: 30 December 2019 15:37
> >>>>> To: Robert Milkowski <rmilkowski@xxxxxxxxx>
> >>>>> Cc: Linux NFS Mailing List <linux-nfs@xxxxxxxxxxxxxxx>; Trond
> >>>>> Myklebust
> >>>>> <trond.myklebust@xxxxxxxxxxxxxxx>; Anna Schumaker
> >>>>> <anna.schumaker@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx
> >>>>> Subject: Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do
> >>>>> implicit
> >>>>> lease renewals
> >>>>>
> >>>>>
> >>>>>
> >>>>>> On Dec 30, 2019, at 10:20 AM, Robert Milkowski <
> >>>>>> rmilkowski@xxxxxxxxx>
> >>>>> wrote:
> >>>>>> From: Robert Milkowski <rmilkowski@xxxxxxxxx>
> >>>>>>
> >>>>>> Currently, each time nfs4_do_fsinfo() is called it will do an
> >>>>>> implicit
> >>>>>> NFS4 lease renewal, which is not compliant with the NFS4
> >>>>> specification.
> >>>>>> This can result in a lease being expired by an NFS server.
> >>>>>>
> >>>>>> Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing
> >>>>>> leases")
> >>>>>> introduced implicit client lease renewal in nfs4_do_fsinfo(),
> >>>>>> which
> >>>>>> can result in the NFSv4.0 lease to expire on a server side, and
> >>>>>> servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.
> >>>>>>
> >>>>>> This can easily be reproduced by frequently unmounting a sub-
> >>>>>> mount,
> >>>>>> then stat'ing it to get it mounted again, which will delay or
> >>>>>> even
> >>>>>> completely prevent client from sending RENEW operations if no
> >>>>>> other
> >>>>>> NFS operations are issued. Eventually nfs server will expire
> >>>>>> client's
> >>>>>> lease and return an error on file access or next RENEW.
> >>>>>>
> >>>>>> This can also happen when a sub-mount is automatically
> >>>>>> unmounted due
> >>>>>> to inactivity (after nfs_mountpoint_expiry_timeout), then it is
> >>>>>> mounted again via stat(). This can result in a short window
> >>>>>> during
> >>>>>> which client's lease will expire on a server but not on a
> >>>>>> client.
> >>>>>> This specific case was observed on production systems.
> >>>>>>
> >>>>>> This patch makes an explicit lease renewal instead of an
> >>>>>> implicit one,
> >>>>>> by adding RENEW to a compound operation issued by
> >>>>>> nfs4_do_fsinfo(),
> >>>>>> similarly to NFSv4.1 which adds SEQUENCE operation.
> >>>>>>
> >>>>>> Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing
> >>>>>> leases")
> >>>>>> Signed-off-by: Robert Milkowski <rmilkowski@xxxxxxxxx>
> >>>>>
> >>>>> Reviewed-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> >>>>>
> >>>>>
> >>>>
> >>>> How do we progress it further?
> >>>
> >>> Thanks for following up! I have the patch included in my linux-next
> >>> branch for
> >>> the next merge window.
> >>
> >> NACK. This is the wrong way to solve the problem. Lease renewal in
> >> NFSv4 does not need to be tied to fsinfo. It creates an unnecessary
> >> extra error condition that has absolutely nothing to do with the
> >> functionality of retrieving per-filesystem attributes.
> >
> > Well, we also do it for NFSv4.1+ with the SEQUENCE operation being
> > send by fsinfo, and as Chuck pointed out
> > it makes sense to do similarly for 4.0, which is what this patch does.
>
> I did say that.
>
> However, I can see that for NFSv4.1+, the client code handling the
> SEQUENCE response will update cl_last_renewal. It does not need to
> be done in the fsinfo code.

I think it is the case, yes.

>
> The NFSv4.0 behavior should be correct if cl_last_renewal is not
> updated. That should force the client to send a separate RENEW
> operation so that both the client and server agree that the lease
> is active.
>

I was thinking about removing the call to update the last renewal
entirely in do_fsinfo(),
however as briefly discussed back in Dec there is an issue with
cl_last_renewal initialization on initial mount in 4.0
I observed it to be 0, as nfs4_setup_state_renewal() calls
nfs4_proc_get_lease_time() on initial mount and if no error it calls
nfs4_set_lease_period().
However with sec=krb the call to nfs4_proc_get_lease_time() returns
NFS4ERR_WRONGSEC) during initial mount (which seems to be ok), which
results in not setting cl_last_renewal,
which iirc prevented scheduling RENEW operations altogether. As
discussed then, this is really a separate issue which should be fixed
separately.
Once fixed then fsinfo() shouldn't need to set cl_last_renewal at all,
still sending RENEW in fsinfo() seems like a good idea to make it more
inline with what we do for 4.1.

> If I understand Trond correctly?
>

The problem with what Trond proposed is that it seems to go against
the rfc, although it should fix the initialization issue.

--
Robert Milkowski