Re: [PATCH] nfs: decrement nrequests counter before releasing the req

From: Benjamin Coddington
Date: Mon Oct 02 2023 - 16:04:45 EST


On 28 Sep 2023, at 12:23, Anna Schumaker wrote:
>>
>> On 28 Sep 2023, at 9:04, Benjamin Coddington wrote:
>>>
>>> I think your patch on July 25th fixes a real bug based on that vmcore we
>>> have, but it's a pretty crazy race. I'll try again to reproduce it.
>>>
>>> Ben
>
> Okay, I'll plan on taking this patch for 6.6-rc now and if we
> determine there is still a bug we can add an additional patch on top.

I can reproduce the crash Scott reported in July:
https://lore.kernel.org/linux-nfs/20230725150807.8770-1-smayhew@xxxxxxxxxx/

I used a three-way race between lock(), write(), and syncfs() with a
mapped file and no delegation, and I can reliably hit it by inserting a few
small delays into the kernel.

Scott's patch fixes it and his analysis is correct. I can resend that patch
with more information, or provide more details, if needed. Basically the
issue is that the lock/unlock path sets NFS_INO_INVALID_DATA, the write path
invalidates on that, and syncfs() (or a rarer writeback flush) swoops in and
dereferences the null folio->mapping.

I think Jeff's fix here is also correct, both problems need a fix.

Ben

(Splat included because.. it was hard to make it.)

[ 966.743865] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 966.743879] Mem abort info:
[ 966.743884] ESR = 0x0000000096000004
[ 966.743890] EC = 0x25: DABT (current EL), IL = 32 bits
[ 966.743898] SET = 0, FnV = 0
[ 966.743904] EA = 0, S1PTW = 0
[ 966.743910] FSC = 0x04: level 0 translation fault
[ 966.743918] Data abort info:
[ 966.743923] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 966.743931] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 966.743938] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 966.743946] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000100a03000
[ 966.743955] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[ 966.743965] Internal error: Oops: 0000000096000004 [#1] SMP
[ 966.743974] Modules linked in: nfsv4(OE) nfs(OE) nfsd nfs_acl rpcsec_gss_krb5 auth_rpcgss dns_resolver lockd grace fscache netfs nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink qrtr vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock sunrpc vfat fat snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep uvcvideo snd_seq uvc videobuf2_vmalloc snd_seq_device videobuf2_memops videobuf2_v4l2 snd_pcm videobuf2_common videodev snd_timer snd mc joydev soundcore vmw_vmci loop zram xfs crct10dif_ce polyval_ce polyval_generic ghash_ce sha3_ce sha512_ce vmwgfx sha512_arm64 e1000e nvme nvme_core nvme_common drm_ttm_helper ttm uhci_hcd scsi_dh_rdac scsi_dh_emc scsi_dh_alua fuse dm_multipath
[ 966.744003] Unloaded tainted modules: nfs(OE):1 nfsv4(OE):1 [last unloaded: nfs(OE)]
[ 966.744060] CPU: 6 PID: 3666 Comm: kworker/u27:0 Tainted: G OE 6.6.0-rc1.nks #191
[ 966.744307] Hardware name: VMware, Inc. VMware20,1/VBSA, BIOS VMW201.00V.20904234.BA64.2212051119 12/05/2022
[ 966.744504] Workqueue: nfsiod rpc_async_release [sunrpc]
[ 966.744725] pstate: 21400005 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 966.744915] pc : nfs_inode_remove_request+0xc0/0x220 [nfs]
[ 966.745123] lr : nfs_inode_remove_request+0x9c/0x220 [nfs]
[ 966.745323] sp : ffff8000848fbc90
[ 966.745499] x29: ffff8000848fbc90 x28: 0000000000000000 x27: ffff8000823aec50
[ 966.745673] x26: ffff800081c0e008 x25: ffff0000c7afe2d0 x24: ffff0000c7afe2e0
[ 966.745849] x23: 0000000000000000 x22: ffff80007b92bea8 x21: ffff0000cc137d00
[ 966.746021] x20: 0000000000418958 x19: 0000000000000000 x18: 0000000000000014
[ 966.746193] x17: 1d00000000000000 x16: 4100000000000000 x15: 00000003989a4603
[ 966.746365] x14: 0000000000000000 x13: 0000000000000030 x12: 0101010101010101
[ 966.746536] x11: 7f7f7f7f7f7f7f7f x10: fefefefefefefeff x9 : ffff80007bb02f4c
[ 966.746708] x8 : ffff8000848fbbd0 x7 : 0000000000000000 x6 : 0000000000000000
[ 966.746881] x5 : 0000000000000800 x4 : ffff8000848fbc70 x3 : ffff8000848fbc90
[ 966.747052] x2 : 0000000000000002 x1 : 002fffff00000128 x0 : 0000000000000000
[ 966.747222] Call trace:
[ 966.747390] nfs_inode_remove_request+0xc0/0x220 [nfs]
[ 966.747574] nfs_commit_release_pages+0x218/0x338 [nfs]
[ 966.747756] nfs_commit_release+0x28/0x58 [nfs]
[ 966.747934] rpc_free_task+0x3c/0x78 [sunrpc]
[ 966.748128] rpc_async_release+0x34/0x60 [sunrpc]
[ 966.748315] process_one_work+0x170/0x3d8
[ 966.748482] worker_thread+0x2bc/0x3e0
[ 966.748648] kthread+0xf8/0x110
[ 966.748813] ret_from_fork+0x10/0x20
[ 966.748977] Code: 36980741 f9400001 36600701 952221f7 (f9400000)
[ 966.749143] ---[ end trace 0000000000000000 ]---