RE: [PATCH v2 0/8] fsdax,xfs: fix warning messages

From: Dan Williams
Date: Fri Dec 02 2022 - 20:21:51 EST


Shiyang Ruan wrote:
> Changes since v1:
> 1. Added a snippet of the warning message and some of the failed cases
> 2. Split the patch up for easier review
> 3. Added page->share and its helper functions
> 4. Included the patch[1] that removes the restrictions of fsdax and reflink
> [1] https://lore.kernel.org/linux-xfs/1663234002-17-1-git-send-email-ruansy.fnst@xxxxxxxxxxx/
>
> Many test cases fail in dax+reflink mode with a warning message in dmesg,
> such as generic/051, 075 and 127. The warning message looks like this:
> [ 775.509337] ------------[ cut here ]------------
> [ 775.509636] WARNING: CPU: 1 PID: 16815 at fs/dax.c:386 dax_insert_entry.cold+0x2e/0x69
> [ 775.510151] Modules linked in: auth_rpcgss oid_registry nfsv4 algif_hash af_alg af_packet nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter ip_tables x_tables dax_pmem nd_pmem nd_btt sch_fq_codel configfs xfs libcrc32c fuse
> [ 775.524288] CPU: 1 PID: 16815 Comm: fsx Kdump: loaded Tainted: G W 6.1.0-rc4+ #164 eb34e4ee4200c7cbbb47de2b1892c5a3e027fd6d
> [ 775.524904] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.16.0-3-3 04/01/2014
> [ 775.525460] RIP: 0010:dax_insert_entry.cold+0x2e/0x69
> [ 775.525797] Code: c7 c7 18 eb e0 81 48 89 4c 24 20 48 89 54 24 10 e8 73 6d ff ff 48 83 7d 18 00 48 8b 54 24 10 48 8b 4c 24 20 0f 84 e3 e9 b9 ff <0f> 0b e9 dc e9 b9 ff 48 c7 c6 a0 20 c3 81 48 c7 c7 f0 ea e0 81 48
> [ 775.526708] RSP: 0000:ffffc90001d57b30 EFLAGS: 00010082
> [ 775.527042] RAX: 000000000000002a RBX: 0000000000000000 RCX: 0000000000000042
> [ 775.527396] RDX: ffffea000a0f6c80 RSI: ffffffff81dfab1b RDI: 00000000ffffffff
> [ 775.527819] RBP: ffffea000a0f6c40 R08: 0000000000000000 R09: ffffffff820625e0
> [ 775.528241] R10: ffffc90001d579d8 R11: ffffffff820d2628 R12: ffff88815fc98320
> [ 775.528598] R13: ffffc90001d57c18 R14: 0000000000000000 R15: 0000000000000001
> [ 775.528997] FS: 00007f39fc75d740(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000
> [ 775.529474] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 775.529800] CR2: 00007f39fc772040 CR3: 0000000107eb6001 CR4: 00000000003706e0
> [ 775.530214] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 775.530592] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 775.531002] Call Trace:
> [ 775.531230] <TASK>
> [ 775.531444] dax_fault_iter+0x267/0x6c0
> [ 775.531719] dax_iomap_pte_fault+0x198/0x3d0
> [ 775.532002] __xfs_filemap_fault+0x24a/0x2d0 [xfs aa8d25411432b306d9554da38096f4ebb86bdfe7]
> [ 775.532603] __do_fault+0x30/0x1e0
> [ 775.532903] do_fault+0x314/0x6c0
> [ 775.533166] __handle_mm_fault+0x646/0x1250
> [ 775.533480] handle_mm_fault+0xc1/0x230
> [ 775.533810] do_user_addr_fault+0x1ac/0x610
> [ 775.534110] exc_page_fault+0x63/0x140
> [ 775.534389] asm_exc_page_fault+0x22/0x30
> [ 775.534678] RIP: 0033:0x7f39fc55820a
> [ 775.534950] Code: 00 01 00 00 00 74 99 83 f9 c0 0f 87 7b fe ff ff c5 fe 6f 4e 20 48 29 fe 48 83 c7 3f 49 8d 0c 10 48 83 e7 c0 48 01 fe 48 29 f9 <f3> a4 c4 c1 7e 7f 00 c4 c1 7e 7f 48 20 c5 f8 77 c3 0f 1f 44 00 00
> [ 775.535839] RSP: 002b:00007ffc66a08118 EFLAGS: 00010202
> [ 775.536157] RAX: 00007f39fc772001 RBX: 0000000000042001 RCX: 00000000000063c1
> [ 775.536537] RDX: 0000000000006400 RSI: 00007f39fac42050 RDI: 00007f39fc772040
> [ 775.536919] RBP: 0000000000006400 R08: 00007f39fc772001 R09: 0000000000042000
> [ 775.537304] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000001
> [ 775.537694] R13: 00007f39fc772000 R14: 0000000000006401 R15: 0000000000000003
> [ 775.538086] </TASK>
> [ 775.538333] ---[ end trace 0000000000000000 ]---
>
> This also affects dax+noreflink mode if we run the test after a
> dax+reflink test. So, the most urgent thing is to fix the warning
> messages.
>
> With these fixes, most warning messages in dax_associate_entry() are
> gone. However, generic/388 still fails randomly with the warning.
> That test shuts down the xfs filesystem while fsstress is running, and
> repeats this many times. I think the reason is that dax pages still in
> use cannot be invalidated in time when the fs is shut down. The next
> time such a dax page is associated, it still holds the mapping value
> set last time. I'll keep working on it.
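
The suspected failure sequence can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not
the code in fs/dax.c: struct toy_page, toy_associate(),
toy_disassociate() and the shared marker are hypothetical names made up
for illustration, and the share counter only loosely mirrors the
page->share idea from patch 1. It shows why a page whose owner was never
cleared trips a check when it is associated again.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for struct page: only the fields the check depends on. */
struct toy_page {
	const void *mapping;	/* owner recorded at association time */
	unsigned long share;	/* hypothetical count of sharing files */
};

/* Counts how many times the toy "warning" fired. */
static int toy_warnings;

/* Hypothetical sentinel meaning "owned by more than one file". */
static const char toy_shared_marker;
#define TOY_MAPPING_SHARED ((const void *)&toy_shared_marker)

static void toy_associate(struct toy_page *page, const void *mapping,
			  bool shared)
{
	if (shared) {
		/* Shared extents are counted instead of owned. */
		if (page->mapping != TOY_MAPPING_SHARED) {
			page->mapping = TOY_MAPPING_SHARED;
			page->share = 0;
		}
		page->share++;
		return;
	}
	/* A non-shared page must not already have an owner. */
	if (page->mapping != NULL) {
		toy_warnings++;
		fprintf(stderr, "toy warning: stale mapping on associate\n");
	}
	page->mapping = mapping;
}

static void toy_disassociate(struct toy_page *page)
{
	/* A shared page stays associated until the last sharer leaves. */
	if (page->mapping == TOY_MAPPING_SHARED && --page->share > 0)
		return;
	page->mapping = NULL;
	page->share = 0;
}

/* Replays the suspected sequence; returns the number of warnings. */
static int toy_replay_shutdown_race(void)
{
	struct toy_page page = { NULL, 0 };
	int file_a, file_b;	/* only their addresses are used */

	toy_warnings = 0;
	toy_associate(&page, &file_a, false);	/* normal fault */
	/* Shutdown: teardown is skipped, toy_disassociate() never runs. */
	toy_associate(&page, &file_b, false);	/* next fault, stale owner */
	return toy_warnings;
}
```

In this model the warning disappears as soon as toy_disassociate() is
called between the two associations, which matches the observation that
the pages need to be invalidated before the fs shutdown completes.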

This one also sounds like it is going to be relevant for CXL PMEM, and
the improvements to the reference counting. CXL has a facility where the
driver asserts that no more writes are in-flight to the device so that
the device can assert a clean shutdown. Part of that will be making sure
that page access ends at fs shutdown.

> The warning message in dax_writeback_one() can also be fixed because of
> the dax unshare.
>
>
> Shiyang Ruan (8):
>   fsdax: introduce page->share for fsdax in reflink mode
>   fsdax: invalidate pages when CoW
>   fsdax: zero the edges if source is HOLE or UNWRITTEN
>   fsdax,xfs: set the shared flag when file extent is shared
>   fsdax: dedupe: iter two files at the same time
>   xfs: use dax ops for zero and truncate in fsdax mode
>   fsdax,xfs: port unshare to fsdax
>   xfs: remove restrictions for fsdax and reflink
>
>  fs/dax.c                   | 220 +++++++++++++++++++++++++------------
>  fs/xfs/xfs_ioctl.c         |   4 -
>  fs/xfs/xfs_iomap.c         |   6 +-
>  fs/xfs/xfs_iops.c          |   4 -
>  fs/xfs/xfs_reflink.c       |   8 +-
>  include/linux/dax.h        |   2 +
>  include/linux/mm_types.h   |   5 +-
>  include/linux/page-flags.h |   2 +-
>  8 files changed, 166 insertions(+), 85 deletions(-)
>
> --
> 2.38.1
>
>