Re: [PATCH v5 40/40] 9p: Use netfslib read/write_iter

From: Dominique Martinet
Date: Wed Jan 03 2024 - 08:00:52 EST


David Howells wrote on Wed, Jan 03, 2024 at 12:39:34PM +0000:
> > p9_client_write return value should always be subreq->len, but I believe
> > we should use it unless err is set.
> > (It's also possible for partial writes to happen, e.g. p9_client_write
> > looped a few times and then failed, at which point the size returned
> > would be the amount that actually got through -- we probably should do
> > something with that?)
>
> How about something like:
>
> - int err;
> + int err, len;
>
> trace_netfs_sreq(subreq, netfs_sreq_trace_submit);
> - p9_client_write(fid, subreq->start, &subreq->io_iter, &err);
> - netfs_write_subrequest_terminated(subreq, err < 0 ? err : subreq->len,
> - false);
> + len = p9_client_write(fid, subreq->start, &subreq->io_iter, &err);
> + netfs_write_subrequest_terminated(subreq, len ?: err, false);

I think that'll be fine; a plain write() syscall works like this when an
error happens after some data has been flushed, and I assume there'll be
some retry if this happened on something like a dirty page flush and it
got a partial write reported?
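
To spell out the convention I have in mind, here is a rough userspace
sketch. mock_client_write() is a made-up stand-in for p9_client_write()
(the chunking and the fail_after knob are assumptions for illustration
only), and write_result() is just the proposed `len ?: err` written in
portable C:

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for p9_client_write(): sends `total` bytes in
 * chunks of at most `chunk` bytes and fails with -EIO on the first chunk
 * that would go past `fail_after` bytes. Returns the number of bytes that
 * actually got through and stores 0 or a negative errno in *err.
 */
static int mock_client_write(size_t total, size_t chunk, size_t fail_after,
			     int *err)
{
	size_t done = 0;

	*err = 0;
	while (done < total) {
		size_t n = total - done < chunk ? total - done : chunk;

		if (done + n > fail_after) {
			*err = -EIO;
			break;
		}
		done += n;
	}
	return (int)done;
}

/* Same result as `len ?: err`: bytes written if any, otherwise the error. */
static int write_result(int len, int err)
{
	return len ? len : err;
}
```

So a write that fails after two chunks reports the short length, and only
a write that moved nothing at all reports the error, which is what
write() does as well.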

> > > + if (file) {
> > > + fid = file->private_data;
> > > + BUG_ON(!fid);
> >
> > This probably should be WARN + return EINVAL like find by inode?
> > It's certainly a huge problem, but we should avoid BUG if possible...
>
> Sure. The BUG_ON() was already there, but I can turn it into a WARN+error.

Thanks.

> > nit: not sure what's cleaner?
> > Since there's a message that makes for a bit awkward if...
> >
> > if (WARN_ONCE(!fid, "folio expected an open fid inode->i_private=%p\n",
> > rreq->inode->i_private))
> > return -EINVAL;
> >
> > (as a side note, I'm not sure what to make of this i_private pointer
> > here, but if that'll help you figure something out sure..)
>
> Um. 9p is using i_private. But perhaps i_ino would be a better choice:
>
> if (file) {
> fid = file->private_data;
> if (!fid)
> goto no_fid;
> p9_fid_get(fid);
> } else {
> fid = v9fs_fid_find_inode(rreq->inode, writing, INVALID_UID, true);
> if (!fid)
> goto no_fid;
> }
>
> ...
>
> no_fid:
> WARN_ONCE(1, "folio expected an open fid inode->i_ino=%lx\n",
> rreq->inode->i_ino);
> return -EINVAL;

Might be useful to track down whether this came from a file without
private data or from the lookup failing, but given this was a bug I guess
we can deal with that when that happens -- ack.

> > This is as follow on your netfs-lib branch:
> > - WARN_ON(rreq->origin == NETFS_READ_FOR_WRITE &&
> > - !(fid->mode & P9_ORDWR));
> > -
> > - p9_fid_get(fid);
> > + WARN_ON(rreq->origin == NETFS_READ_FOR_WRITE && !(fid->mode & P9_ORDWR));
> >
> > So the WARN_ON has been reverted back with only indentation changed;
> > I guess there were patterns that were writing despite the fid not having
> > been open as RDWR?
> > Do you still have details about these?
>
> The condition in the WARN_ON() here got changed. It was:
>
> WARN_ON(writing && ...
>
> at one point, but that caused a bunch of incorrect warning to appear because
> only NETFS_READ_FOR_WRITE requires read-access as well as write-access. All
> the others:
>
> bool writing = (rreq->origin == NETFS_READ_FOR_WRITE ||
> rreq->origin == NETFS_WRITEBACK ||
> rreq->origin == NETFS_WRITETHROUGH ||
> rreq->origin == NETFS_LAUNDER_WRITE ||
> rreq->origin == NETFS_UNBUFFERED_WRITE ||
> rreq->origin == NETFS_DIO_WRITE);
>
> only require write-access.

Thanks for clarifying
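
For my own notes, the difference between the two conditions as a
userspace sketch. The enum and mode values below are mocks I made up
(only the origin names come from your list; MOCK_READ stands for any
non-writing origin), so take it as an illustration rather than the real
netfs or 9p definitions:

```c
#include <stdbool.h>

/* Mock origins; MOCK_READ is a hypothetical placeholder for plain reads. */
enum mock_origin {
	MOCK_READ,
	MOCK_READ_FOR_WRITE,
	MOCK_WRITEBACK,
	MOCK_WRITETHROUGH,
	MOCK_LAUNDER_WRITE,
	MOCK_UNBUFFERED_WRITE,
	MOCK_DIO_WRITE,
};

/* Assumed open-mode values for the sketch, not taken from the headers. */
#define MOCK_OWRITE 0x01
#define MOCK_ORDWR  0x02

/* Every origin in the quoted list counts as "writing". */
static bool is_writing(enum mock_origin origin)
{
	return origin != MOCK_READ;
}

/* Earlier condition: warns for any writing origin without RDWR. */
static bool old_should_warn(enum mock_origin origin, unsigned int mode)
{
	return is_writing(origin) && !(mode & MOCK_ORDWR);
}

/* Current condition: only read-for-write needs read access too. */
static bool should_warn(enum mock_origin origin, unsigned int mode)
{
	return origin == MOCK_READ_FOR_WRITE && !(mode & MOCK_ORDWR);
}
```

With a write-only fid, the earlier condition fires on plain writeback
while the current one stays quiet, which matches the spurious warnings
you described.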

> > If a file has been open without the write bit it might not go through,
> > and it's incredibly difficult to get such users back to userspace in
> > async cases (e.g. mmap flushes), so would like to understand that.
>
> The VFS/VM should prevent writing to files that aren't open O_WRONLY or
> O_RDWR, so I don't think we should be called in otherwise.

Historically this check was more about finding a fid that wasn't opened
properly than the VFS doing something weird (e.g. by calling mprotect
after mmap and us missing that -- would need to check if that works
actually...)
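
Something like the helper below is what I'd use to check the mprotect
case. It is only a sketch: mprotect_upgrade_errno() is a name I just made
up, and it would have to be pointed at a file on a 9p mount to tell us
anything about our code. On a local filesystem mprotect(2) documents
EACCES for this case; whether 9p behaves the same is exactly the open
question.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Open `path` read-only, map it shared and read-only, then try to upgrade
 * the mapping to writable with mprotect().
 * Returns 0 if the upgrade was allowed, the errno from mprotect() if it
 * was refused, or -1 if the setup (open/mmap) itself failed.
 */
static int mprotect_upgrade_errno(const char *path)
{
	long page = sysconf(_SC_PAGESIZE);
	int fd, ret, saved;
	void *p;

	if (page <= 0)
		return -1;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	p = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		close(fd);
		return -1;
	}

	/* The mapping is never touched, so a short file is fine here. */
	ret = mprotect(p, (size_t)page, PROT_READ | PROT_WRITE);
	saved = ret < 0 ? errno : 0;

	munmap(p, (size_t)page);
	close(fd);
	return saved;
}
```

If this ever returns 0 on a 9p file, we would have a writable shared
mapping backed by a fid that was never opened for writing.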

> > > + return netfs_page_mkwrite(vmf, NULL);
> >
> > (I guess there's no helper that could be used directly in .page_mkwrite
> > op?)
>
> I could provide a helper that just supplies NULL as the second argument. I
> think only 9p will use it, but that's fine.

If we're the only user I guess we shouldn't bother with it at this
point, we can come back to it if this ever becomes common.

--
Dominique Martinet | Asmadeus