Re: [PATCH] xfs: fail dax mount if reflink is enabled on a partition

From: Shiyang Ruan
Date: Thu Sep 08 2022 - 09:46:29 EST




在 2022/8/4 8:51, Darrick J. Wong 写道:
On Wed, Aug 03, 2022 at 06:47:24AM +0000, ruansy.fnst@xxxxxxxxxxx wrote:

...


BTW, since these patches (dax&reflink&rmap + THIS + pmem-unbind) are
waiting to be merged, is it time to think about "removing the
experimental tag" again? :)

It's probably time to take up that question again.

Yesterday I tried running generic/470 (aka the MAP_SYNC test) and it
didn't succeed because it sets up dmlogwrites atop dmthinp atop pmem,
and at least one of those dm layers no longer allows fsdax pass-through,
so XFS silently turned mount -o dax into -o dax=never. :(

Hi Darrick,

I tried generic/470 but it didn't run:
[not run] Cannot use thin-pool devices on DAX capable block devices.

Did you modify the _require_dm_target() in common/rc? I added thin-pool
to not to check dax capability:

case $target in
stripe|linear|log-writes|thin-pool) # add thin-pool here
;;

then the case finally ran and it silently turned off dax as you said.

Are the steps for reproduction correct? If so, I will continue to
investigate this problem.

Ah, yes, I did add thin-pool to that case statement. Sorry I forgot to
mention that. I suspect that the removal of dm support for pmem is
going to force us to completely redesign this test. I can't really
think of how, though, since there's no good way that I know of to gain a
point-in-time snapshot of a pmem device.

Hi Darrick,

> removal of dm support for pmem
I think here we are saying about xfstest who removed the support, not
kernel?

I found some xfstests commits:
fc7b3903894a6213c765d64df91847f4460336a2 # common/rc: add the restriction.
fc5870da485aec0f9196a0f2bed32f73f6b2c664 # generic/470: use thin-pool

So, this case was never able to run since the second commit? (I didn't
notice the not run case. I thought it was expected to be not run.)

And according to the first commit, the restriction was added because
some of dm devices don't support dax. So my understanding is: we should
redesign the case to make the it work, and firstly, we should add dax
support for dm devices in kernel.

dm devices used to have fsdax support; I think Christoph is actively
removing (or already has removed) all that support.

In addition, is there any other testcase has the same problem? so that
we can deal with them together.

The last I checked, there aren't any that require MAP_SYNC or pmem aside
from g/470 and the three poison notification tests that you sent a few
days ago.

--D


Hi Darrick, Brian

I made a little investigation on generic/470.

This case was able to run before introducing thin-pool[1], but since that, it became 'Failed'/'Not Run' because thin-pool does not support DAX. I have checked the log of thin-pool, it never supports DAX. And, it's not someone has removed the fsdax support. So, I think it's not correct to bypass the requirement conditions by adding 'thin-pool' to _require_dm_target().

As far as I known, to prevent out-of-order replay of dm-log-write, thin-pool was introduced (to provide discard zeroing). Should we solve the 'out-of-order replay' issue instead of avoiding it by thin-pool? @Brian

Besides, since it's not a fsdax problem, I think there is nothing need to be fixed in fsdax. I'd like to help it solved, but I'm still wondering if we could back to the original topic("Remove Experimental Tag") firstly? :)


[1] fc5870da485aec0f9196a0f2bed32f73f6b2c664 generic/470: use thin volume for dmlogwrites target device


--
Thanks,
Ruan.