Re: [PATCH] sunrpc: account for xdr->page_base in xdr_alloc_bvec

From: Jeff Layton
Date: Mon Aug 14 2023 - 11:31:43 EST


On Mon, 2023-08-14 at 14:51 +0000, Trond Myklebust wrote:
> On Mon, 2023-08-14 at 10:32 -0400, Jeff Layton wrote:
> > I've been seeing a regression in mainline (v6.5-rc) kernels where
> > unaligned reads were returning corrupt data.
> >
> > 9d96acbc7f37 added a routine to allocate and populate a bvec array that
> > can be used to back an iov_iter. When it does this, it always sets the
> > offset in the first bvec to zero, even when the xdr->page_base is
> > non-zero.
> >
> > The old code in svc_tcp_sendmsg used to account for this, as it was
> > sending the pages one at a time anyway, but now that we just hand the
> > iov to the network layer, we need to ensure that the bvecs are properly
> > initialized.
> >
> > Fix xdr_alloc_bvec to set the offset in the first bvec to the offset
> > indicated by xdr->page_base, and then 0 in all subsequent bvecs.
> >
> > Fixes: 9d96acbc7f37 ("SUNRPC: Add a bvec array to struct xdr_buf for use with iovec_iter()")
> > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > ---
> > NB: This is only lightly tested so far, but it seems to fix the pynfs
> > regressions I've been seeing.
> > ---
> >  net/sunrpc/xdr.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
> > index 2a22e78af116..d0f5fc8605b8 100644
> > --- a/net/sunrpc/xdr.c
> > +++ b/net/sunrpc/xdr.c
> > @@ -144,6 +144,7 @@ int
> >  xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp)
> >  {
> >         size_t i, n = xdr_buf_pagecount(buf);
> > +       unsigned int offset = offset_in_page(buf->page_base);
> >  
> >         if (n != 0 && buf->bvec == NULL) {
> >                 buf->bvec = kmalloc_array(n, sizeof(buf->bvec[0]), gfp);
> > @@ -151,7 +152,8 @@ xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp)
> >                         return -ENOMEM;
> >                 for (i = 0; i < n; i++) {
> >                         bvec_set_page(&buf->bvec[i], buf->pages[i], PAGE_SIZE,
> > -                                     0);
> > +                                     offset);
> > +                       offset = 0;
>
> NACK. That's going to break the client.
>

<rant>
What's the point of setting up a bvec array that doesn't actually
describe the usable data?
</rant>

Sigh, ok...I suppose we'll need to fix this in the svc callers instead.
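
To illustrate what "describing the usable data" could mean, here is a minimal
userspace sketch. It is not the kernel code: struct model_bvec,
model_fill_bvecs() and MODEL_PAGE_SIZE are hypothetical names invented for
this example, and a 4 KiB page size is assumed. Only the first entry carries
a non-zero offset, and each length is clamped so no entry runs past the end
of its page or past the end of the data.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Hypothetical stand-in for a bvec entry; not the kernel's struct bio_vec. */
struct model_bvec {
	size_t page_index; /* which page of the buffer this entry refers to */
	size_t offset;     /* byte offset of the data within that page */
	size_t len;        /* number of usable bytes in this entry */
};

/*
 * Fill out[] so that the entries cover exactly data_len bytes, starting
 * page_base bytes into the page array. Returns the number of entries
 * written, or -1 if cap entries are not enough.
 */
static int model_fill_bvecs(struct model_bvec *out, size_t cap,
			    size_t page_base, size_t data_len)
{
	size_t page = page_base / MODEL_PAGE_SIZE;
	size_t offset = page_base % MODEL_PAGE_SIZE;
	size_t n = 0;

	while (data_len > 0) {
		/* Bytes left in the current page after the offset. */
		size_t chunk = MODEL_PAGE_SIZE - offset;

		if (chunk > data_len)
			chunk = data_len;
		if (n == cap)
			return -1;

		out[n].page_index = page++;
		out[n].offset = offset;
		out[n].len = chunk;
		n++;

		data_len -= chunk;
		offset = 0; /* only the first entry starts mid-page */
	}
	return (int)n;
}
```

With page_base = 100 and 5000 bytes of data, this model yields two entries:
offset 100 with length 3996, then offset 0 with length 1004. Whether that
setup belongs in the shared allocator or in the individual svc callers is
the question raised in this thread; the sketch takes no position on it.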

> >                 }
> >         }
> >         return 0;
> >
> > ---
> > base-commit: 2ccdd1b13c591d306f0401d98dedc4bdcd02b421
> > change-id: 20230814-sendpage-b04874eed249
> >
> > Best regards,
>

--
Jeff Layton <jlayton@xxxxxxxxxx>