Re: [PATCH v5 1/2] dax: Don't touch i_dio_count in dax_do_io()

From: Christoph Hellwig
Date: Thu May 05 2016 - 10:27:55 EST


On Thu, May 05, 2016 at 04:16:37PM +0200, Jan Kara wrote:
> We cannot easily do this currently - the reason is that in several places we
> wait for i_dio_count to drop to 0 (look for inode_dio_wait()) while
> holding i_mutex to wait for all outstanding DIO / DAX IO. You'd break this
> logic with this patch.
>
> If we indeed put all writes under i_mutex, this problem would go away but
> as Dave explains in his email, we consciously do as much IO as we can
> without i_mutex to allow reasonable scalability of multiple writers into
> the same file.

So the above should be fine for xfs, but you're telling me that ext4
is doing DAX I/O without any inode lock at all? In that case it's
indeed not going to work.
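
To make the dependency concrete, here is a rough userspace sketch of the
pattern Jan describes: writers do their I/O without the inode lock but are
counted, and a truncate-like path takes the lock and waits for the count to
drain. This is only an illustration, not kernel code. struct fake_inode and
all fake_* / run_demo names are made up for the example, and pthread
primitives stand in for i_mutex and the i_dio_count wait. If the I/O stops
being counted, fake_dio_wait() returns immediately and the truncate path can
race with I/O that is still in flight.

```c
#include <pthread.h>

/* Hypothetical userspace stand-in; NOT the kernel's struct inode. */
struct fake_inode {
	pthread_mutex_t i_mutex;   /* plays the role of i_mutex */
	pthread_mutex_t cnt_lock;  /* protects dio_count */
	pthread_cond_t  cnt_zero;  /* signalled when dio_count reaches 0 */
	int             dio_count; /* plays the role of i_dio_count */
};

static void fake_inode_init(struct fake_inode *in)
{
	pthread_mutex_init(&in->i_mutex, NULL);
	pthread_mutex_init(&in->cnt_lock, NULL);
	pthread_cond_init(&in->cnt_zero, NULL);
	in->dio_count = 0;
}

/* Account for one in-flight I/O (in the spirit of inode_dio_begin()). */
static void fake_dio_begin(struct fake_inode *in)
{
	pthread_mutex_lock(&in->cnt_lock);
	in->dio_count++;
	pthread_mutex_unlock(&in->cnt_lock);
}

/* Drop the accounting and wake waiters (in the spirit of inode_dio_end()). */
static void fake_dio_end(struct fake_inode *in)
{
	pthread_mutex_lock(&in->cnt_lock);
	if (--in->dio_count == 0)
		pthread_cond_broadcast(&in->cnt_zero);
	pthread_mutex_unlock(&in->cnt_lock);
}

/* Block until no I/O is in flight (in the spirit of inode_dio_wait()). */
static int fake_dio_wait(struct fake_inode *in)
{
	int left;

	pthread_mutex_lock(&in->cnt_lock);
	while (in->dio_count > 0)
		pthread_cond_wait(&in->cnt_zero, &in->cnt_lock);
	left = in->dio_count;
	pthread_mutex_unlock(&in->cnt_lock);
	return left;
}

/* Writer: completes its I/O without ever taking i_mutex. */
static void *fake_writer(void *arg)
{
	fake_dio_end(arg);
	return NULL;
}

/* Truncate-like path: holds i_mutex and waits for outstanding I/O. */
static int fake_truncate(struct fake_inode *in)
{
	int left;

	pthread_mutex_lock(&in->i_mutex);
	left = fake_dio_wait(in);
	pthread_mutex_unlock(&in->i_mutex);
	return left;
}

/* Returns the number of I/Os still in flight once truncate proceeds. */
static int run_demo(void)
{
	struct fake_inode in;
	pthread_t t;
	int left;

	fake_inode_init(&in);
	fake_dio_begin(&in); /* I/O is counted before it is handed off */
	if (pthread_create(&t, NULL, fake_writer, &in) != 0) {
		fake_dio_end(&in);
		return -1;
	}
	left = fake_truncate(&in);
	pthread_join(t, NULL);
	return left;
}
```

Calling run_demo() from a main() built with -pthread should show the
truncate path proceeding only after the writer has dropped its count.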

> The downside of that is that overwrites and writes vs reads are not atomic
> wrt each other as POSIX requires. It has been that way for direct IO in XFS
> case for a long time, with DAX this non-conforming behavior is proliferating
> more. I agree that's not ideal but serializing all writes on a file is
> rather harsh for persistent memory as well...

For non-O_DIRECT I/O it's simply required.