Re: [PATCH] nfs: set block size according to pnfs_blksize first

From: Trond Myklebust
Date: Wed Jun 16 2021 - 11:14:22 EST


On Wed, 2021-06-16 at 22:41 +0800, Gao Xiang wrote:
> Hi Trond,
>
> On Wed, Jun 16, 2021 at 02:20:49PM +0000, Trond Myklebust wrote:
> > On Wed, 2021-06-16 at 22:06 +0800, Gao Xiang wrote:
> > > On Wed, Jun 16, 2021 at 01:47:13PM +0000, Trond Myklebust wrote:
> > > > On Wed, 2021-06-16 at 20:44 +0800, Gao Xiang wrote:
> > > > > > When testing fstests with ext4 over nfs 4.2, I found generic/486
> > > > > > failed. The root cause is that the length of its xattr value is
> > > > >   min(st_blksize * 3 / 4, XATTR_SIZE_MAX)
> > > > >
> > > > > > which is 4096 * 3 / 4 = 3072 for the underlying ext4 rather than
> > > > > > XATTR_SIZE_MAX = 65536 for nfs since the block size would be wsize
> > > > > > (=131072) if bsize is not specified.
> > > > >
> > > > > Let's use pnfs_blksize first instead of using wsize directly if
> > > > > bsize isn't specified. And the testcase itself can pass now.
> > > > >
> > > > > Cc: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> > > > > Cc: Anna Schumaker <anna.schumaker@xxxxxxxxxx>
> > > > > Cc: Joseph Qi <joseph.qi@xxxxxxxxxxxxxxxxx>
> > > > > Signed-off-by: Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx>
> > > > > ---
> > > > > Considering bsize is not specified, we might use pnfs_blksize
> > > > > directly first rather than wsize.
> > > > >
> > > > >  fs/nfs/super.c | 8 ++++++--
> > > > >  1 file changed, 6 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/fs/nfs/super.c b/fs/nfs/super.c
> > > > > index fe58525cfed4..5015edf0cd9a 100644
> > > > > --- a/fs/nfs/super.c
> > > > > +++ b/fs/nfs/super.c
> > > > > @@ -1068,9 +1068,13 @@ static void nfs_fill_super(struct
> > > > > super_block
> > > > > *sb, struct nfs_fs_context *ctx)
> > > > >         snprintf(sb->s_id, sizeof(sb->s_id),
> > > > >                  "%u:%u", MAJOR(sb->s_dev), MINOR(sb->s_dev));
> > > > >  
> > > > > -       if (sb->s_blocksize == 0)
> > > > > -               sb->s_blocksize = nfs_block_bits(server->wsize,
> > > > > +       if (sb->s_blocksize == 0) {
> > > > > +               unsigned int blksize = server->pnfs_blksize ?
> > > > > +                       server->pnfs_blksize : server->wsize;
> > > >
> > > > > NACK. The pnfs block size is a layout driver-specific quantity, and
> > > > > should not be used to substitute for the server-advertised block size.
> > > > It only applies to I/O _if_ the client is holding a layout for a
> > > > specific file and is using pNFS to do I/O to that file.
> > >
> > > Honestly, I'm not sure if it's ok as well.
> > >
> > > >
> > > > It has nothing to do with xattrs at all.
> > >
> > > > Yet my question is how to deal with generic/486: should we just skip
> > > > the test case directly? I cannot find a proper way to get the
> > > > underlying fs block size or the real xattr value limit.
> > >
> >
> > RFC8276 provides no method for determining the xattr size limits. It
> > just notes that such limits may exist, and provides the error code
> > NFS4ERR_XATTR2BIG, that the server may use as a return value when those
> > limits are exceeded.
> >
> > > For now, generic/486 will return ENOSPC at
> > > fsetxattr(fd, "user.world", value, 65536, XATTR_REPLACE);
> > > when testing new nfs4.2 xattr support.
> > >
> >
> > As noted above, the NFS server should really be returning
> > NFS4ERR_XATTR2BIG in this case, which the client, again, should be
> > transforming into -E2BIG. Where does ENOSPC come from?
>
> Thanks for the detailed explanation...
>
> I think that is due to ext4 returning ENOSPC since I tested
>
> fsetxattr(fd, "user.world", value, 65536, XATTR_REPLACE);
> with ext4 as well and it returned ENOSPC, and I think it's reasonable
> since setxattr() will return ENOSPC for such cases.
> https://man7.org/linux/man-pages/man2/setxattr.2.html
>
> should we transform it to E2BIG instead (at least in NFS
> protocol)? but I'm still not sure that E2BIG is a valid return code for
> setxattr()...

The setxattr() manpage appears to suggest ERANGE is the correct return
value here.

    ERANGE The size of name or value exceeds a filesystem-specific
           limit.


However, I can't tell whether ext4 and xfs ever do that. Furthermore, it
looks as if the VFS always returns E2BIG if size > XATTR_SIZE_MAX.
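
To see which errno each filesystem actually hands back, something like
the following userspace sketch might help. It is untested against ext4,
xfs or NFS, and it is not taken from generic/486: the helper names
test_value_len() and probe_replace_errno() and the local constant
SKETCH_XATTR_SIZE_MAX are made up for illustration. The length formula
is the one quoted in the patch description, and the probe only does real
work on Linux with glibc's <sys/xattr.h>.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#ifdef __linux__
#include <sys/types.h>
#include <sys/xattr.h>
#endif

/* Assumption: mirrors XATTR_SIZE_MAX (65536) from <linux/limits.h>. */
#define SKETCH_XATTR_SIZE_MAX 65536

/*
 * Value length as given in the patch description:
 *   min(st_blksize * 3 / 4, XATTR_SIZE_MAX)
 */
size_t test_value_len(size_t st_blksize)
{
	size_t len = st_blksize * 3 / 4;

	return len < SKETCH_XATTR_SIZE_MAX ? len : SKETCH_XATTR_SIZE_MAX;
}

/*
 * Hypothetical probe: make sure "user.world" exists on fd with a 1-byte
 * value, then try to replace it with a value of `size` bytes.
 * Returns 0 on success, otherwise the errno of the call that failed.
 */
int probe_replace_errno(int fd, size_t size)
{
#ifdef __linux__
	char *value = calloc(1, size ? size : 1);
	int err = 0;

	if (!value)
		return ENOMEM;

	/* EEXIST just means an earlier run already created the xattr. */
	if (fsetxattr(fd, "user.world", value, 1, XATTR_CREATE) < 0 &&
	    errno != EEXIST)
		err = errno;
	else if (fsetxattr(fd, "user.world", value, size, XATTR_REPLACE) < 0)
		err = errno;

	free(value);
	return err;
#else
	/* Non-Linux builds: the probe is not implemented in this sketch. */
	(void)fd;
	(void)size;
	return ENOTSUP;
#endif
}
```

Calling probe_replace_errno(fd, test_value_len(st.st_blksize)) on a file
on the local ext4 export and again on the NFS mount should show where
the ENOSPC originates, and whether ERANGE or E2BIG ever shows up.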

>
> If necessary, I will look into it more tomorrow....
>
> Thanks,
> Gao Xiang
>
> >
> > > Thanks,
> > > Gao Xiang
> > >
> > > >
> > > > > +
> > > > > +               sb->s_blocksize = nfs_block_bits(blksize,
> > > > > >                                                  &sb->s_blocksize_bits);
> > > > > +       }
> > > > >  
> > > > >         nfs_super_set_maxbytes(sb, server->maxfilesize);
> > > > >         server->has_sec_mnt_opts = ctx->has_sec_mnt_opts;
> >
> > --
> > Trond Myklebust
> > Linux NFS client maintainer, Hammerspace
> > trond.myklebust@xxxxxxxxxxxxxxx
> >
> >

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx