Re: [PATCH 1/3] fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock

From: Kirill Smelkov
Date: Fri Apr 12 2019 - 08:57:12 EST


On Thu, Apr 11, 2019 at 09:22:56AM -0700, Linus Torvalds wrote:
> On Thu, Apr 11, 2019 at 5:38 AM Kirill Smelkov <kirr@xxxxxxxxxx> wrote:
> >
> > However file->f_pos writing is still there and it will bug under race
> > detector, e.g. under KTSAN because read and write can be running
> > simultaneously. Maybe it is not only race bug, but also a bit of
> > slowdown as read and write code paths write to the same memory thus
> > needing inter-cpu synchronization if read and write handlers are on
> > different cpus. However for this I'm not sure.
>
> I doubt it's noticeable, but it would probably be good to simply not
> do the write for streams.
>
> That *could* be done somewhat similarly, with just changing th address
> to be updated, or course.
>
> In fact, we *used* to (long ago) pass in the address of "file->f_pos"
> itself to the low-level read/write routines. We then changed it to do
> that indirection through a local copy of pos (and
> file_pos_read/file_pos_write) because we didn't do the proper locking,
> so different read/write versions could mess with each other (and with
> lseek).
>
> But one of the things that commit 9c225f2655e36 ("vfs: atomic f_pos
> accesses as per POSIX") did was to add the proper locking at least for
> the cases that we care about deeply, so we *could* say that we have
> three cases:
>
> - FMODE_ATOMIC_POS: properly locked,
> - FMODE_STREAM: no pos at all
> - otherwise a "mostly don't care - don't mix!"
>
> and so we could go back to not copying the pos at all, and instead do
> something like
>
> loff_t *ppos = f.file->f_mode & FMODE_STREAM ? NULL : &file->f_pos;
> ret = vfs_write(f.file, buf, count, ppos);
>
> and perhaps have a long-term plan to try to get rid of the "don't mix"
> case entirely (ie "if you use f_pos, then we'll do the proper
> locking")
>
> (The above is obviously surrounded by the fdget_pos()/fdput_pos() that
> implements the locking decision).

Ok Linus, thanks for feedback. Please consider applying or queueing the
following patches.

If the patches are ok, the first one is probably better to be applied
now - to catch wrongly converted drivers right from start, while the
second one could be probably better delayed till merge window. However
please do as what you prefer is best.

Kirill

---- 8< ----