Re: [PATCH v2 00/11] DAX fsync/msync support

From: Ross Zwisler
Date: Mon Nov 16 2015 - 15:01:16 EST


On Mon, Nov 16, 2015 at 08:58:11AM -0800, Dan Williams wrote:
> On Mon, Nov 16, 2015 at 6:41 AM, Jan Kara <jack@xxxxxxx> wrote:
> > On Fri 13-11-15 17:06:39, Ross Zwisler wrote:
> >> This patch series adds support for fsync/msync to DAX.
> >>
> >> Patches 1 through 7 add various utilities that the DAX code will eventually
> >> need, and the DAX code itself is added by patch 8. Patches 9-11 update the
> >> three filesystems that currently support DAX, ext2, ext4 and XFS, to use
> >> the new DAX fsync/msync code.
> >>
> >> These patches build on the recent DAX locking changes from Dave Chinner,
> >> Jan Kara and myself. Dave's changes for XFS and my changes for ext2 have
> >> been merged in the v4.4 window, but Jan's are still unmerged. You can grab
> >> them here:
> >>
> >> http://www.spinics.net/lists/linux-ext4/msg49951.html
> >
> > I had a quick look and the patches look sane to me. I'll try to give them
> > a more detailed look later this week. When thinking about the general design
> > I was wondering: when we have this infrastructure to track data potentially
> > lingering in CPU caches, would it not be a performance win to use standard
> > cached stores in dax_io() and mark the corresponding pages as dirty in the
> > page cache, the same way this patch set does for mmapped writes? I have no
> > idea how costly non-temporal stores are compared to cached ones, or how
> > that would compare to the cost of dirty tracking, so this may be just
> > completely bogus...
>
> Keep in mind that this approach will flush every virtual address that
> may be dirty. For example, if you touch 1 byte in a 2MB page we'll end
> up looping through the entire 2MB range. At some point the dirty size
> becomes large enough that it is cheaper to flush the entire cache; we
> have not measured where that crossover point is.

Yep, I expect there will be a crossover point where flushing the entire
processor cache will be beneficial. I agree with Dan that we'll need to
figure this out via measurement, and that we'd similarly need measurements to
justify the decision to write dirty data at the DAX level without flushing and
mark entries as dirty for fsync/msync to clean up later. It could turn out to
be great, but we'll have to see. :)
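
To put a rough number on Dan's 2MB example, here is a minimal userspace
sketch, not code from the patch set. It assumes a 64-byte cache line (typical
for x86, but an assumption here), and both helper names and the
crossover_lines parameter are hypothetical. The real crossover value is
exactly what still has to be measured.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. Real code would query the CPU. */
#define CACHE_LINE_SIZE 64u

/*
 * Hypothetical helper: number of cache lines overlapped by the byte range
 * [addr, addr + len). A range-based flush would visit each of these lines
 * once, whether or not the line is actually dirty.
 */
static size_t lines_to_flush(uintptr_t addr, size_t len)
{
	uintptr_t mask = ~(uintptr_t)(CACHE_LINE_SIZE - 1);
	uintptr_t first, last;

	if (len == 0)
		return 0;

	first = addr & mask;             /* line containing the first byte */
	last = (addr + len - 1) & mask;  /* line containing the last byte */

	return (size_t)((last - first) / CACHE_LINE_SIZE) + 1;
}

/*
 * Hypothetical policy: prefer a whole-cache flush once the number of lines
 * to visit reaches a crossover value. The crossover is an input because it
 * has not been measured.
 */
static int should_flush_whole_cache(size_t dirty_lines, size_t crossover_lines)
{
	return dirty_lines >= crossover_lines;
}
```

Under these assumptions, a single dirty 2MB page costs
lines_to_flush(0, 2 * 1024 * 1024) = 32768 line flushes even if only one byte
was touched, which is why the dirty-tracking granularity matters for where the
crossover ends up.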