[PATCH v9 0/8] Userspace P2PDMA with O_DIRECT NVMe devices

From: Logan Gunthorpe
Date: Thu Aug 25 2022 - 11:25:31 EST


Hi,

This is the latest P2PDMA userspace patch set. Since the last full
posting[1] the first half of the series[2] has made it into v6.0-rc1.

This version of the patchset also switches to using a sysfs binary
file instead of mmapping the nvme char device directly. This
removes the need for the anonymous inode as well as the numerous
hooks into the nvme driver. The file to mmap will be found in
/sys/<pci_device>/p2pmem/allocate. The latest version of this patch
set is much smaller as a result of these simplifications.
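
As a rough illustration of the new interface, userspace might obtain
P2P memory along the following lines. This is a minimal sketch, not
code from the series: p2pmem_map() is a hypothetical helper, the
caller supplies the full path to the allocate file, and the choice of
a shared read/write mapping at offset 0 is an assumption.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `len` bytes of P2P memory by mmap()ing the
 * sysfs "allocate" file at `allocate_path`.
 *
 * Assumptions (not confirmed by the cover letter): the file is opened
 * read/write and the mapping is MAP_SHARED at offset 0.
 *
 * Returns the mapped address, or NULL on failure.
 */
void *p2pmem_map(const char *allocate_path, size_t len)
{
	void *buf;
	int fd;

	fd = open(allocate_path, O_RDWR);
	if (fd < 0)
		return NULL;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

	/* An established mapping remains valid after the fd is closed. */
	close(fd);

	return buf == MAP_FAILED ? NULL : buf;
}
```

The caller would release the region with munmap(buf, len) once it is
no longer needed.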

This patch set enables userspace P2PDMA by allowing userspace to mmap()
allocated chunks of the CMB. The resulting VMA can be passed only
to O_DIRECT IO on NVMe backed files or block devices. A flag is added
to GUP() in Patch 1, then Patches 2 through 6 wire this flag up based
on whether the block queue indicates P2PDMA support. Patch 7
creates the sysfs resource that can hand out the VMAs and Patch 8
adds brief documentation for the new interface.
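
To illustrate the intended usage, a buffer obtained by mmap()ing P2P
memory would be handed to ordinary O_DIRECT I/O. The sketch below
shows only the O_DIRECT side. p2p_direct_read() is a hypothetical
helper, and `buf` is assumed to point into such a mapping. O_DIRECT
typically also requires the buffer, length and offset to be aligned
to the device's logical block size.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* needed for O_DIRECT with glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read `len` bytes at offset `off` from `path`
 * into `buf`, bypassing the page cache with O_DIRECT.
 *
 * `buf` is assumed to point into a mapping of P2P memory and `path`
 * to name an NVMe-backed file or block device.
 *
 * Returns the number of bytes read, or -1 with errno set.
 */
ssize_t p2p_direct_read(const char *path, void *buf, size_t len, off_t off)
{
	ssize_t n;
	int saved_errno;
	int fd;

	fd = open(path, O_RDONLY | O_DIRECT);
	if (fd < 0)
		return -1;

	n = pread(fd, buf, len, off);

	/* Preserve the errno from pread() across close(). */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;

	return n;
}
```

A write would look the same with O_WRONLY and pwrite(). Since the
resulting VMA can be passed only to O_DIRECT IO, the same buffer
should not be expected to work with buffered I/O.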

Feedback welcome.

This series is based on v6.0-rc2. A git branch is available here:

https://github.com/sbates130272/linux-p2pmem/ p2pdma_user_cmb_v9

Thanks,

Logan

[1] https://lkml.kernel.org/r/20220615161233.17527-1-logang@xxxxxxxxxxxx
[2] https://lkml.kernel.org/r/20220708165104.5005-1-logang@xxxxxxxxxxxx

--

Changes since v7:
- Rebase onto v6.0-rc2, including a rework of the iov_iter patch
  due to changes there
- Drop the char device mmap implementation in favour of a sysfs
based interface. (per Christoph)

Changes since v6:
- Rebase onto v5.19-rc1
- Rework how the pages are stored in the VMA per Jason's suggestion

Changes since v5:
- Rebase onto v5.18-rc1, which includes Christoph's cleanup to
  free_zone_device_page() (similar to Ralph's patch).
- Fix bug with concurrent first calls to pci_p2pdma_vma_fault()
that caused a double allocation and lost p2p memory. Noticed
by Andrew Maier.
- Collected a Reviewed-by tag from Chaitanya.
- Numerous minor fixes to commit messages


--

Logan Gunthorpe (8):
  mm: introduce FOLL_PCI_P2PDMA to gate getting PCI P2PDMA pages
  iov_iter: introduce iov_iter_get_pages_[alloc_]flags()
  block: add check when merging zone device pages
  lib/scatterlist: add check when merging zone device pages
  block: set FOLL_PCI_P2PDMA in __bio_iov_iter_get_pages()
  block: set FOLL_PCI_P2PDMA in bio_map_user_iov()
  PCI/P2PDMA: Allow userspace VMA allocations through sysfs
  ABI: sysfs-bus-pci: add documentation for p2pmem allocate

 Documentation/ABI/testing/sysfs-bus-pci |  12 ++-
 block/bio.c                             |  12 ++-
 block/blk-map.c                         |   7 +-
 drivers/pci/p2pdma.c                    | 124 ++++++++++++++++++++++++
 include/linux/mm.h                      |   1 +
 include/linux/mmzone.h                  |  24 +++++
 include/linux/uio.h                     |   6 ++
 lib/iov_iter.c                          |  40 +++++++-
 lib/scatterlist.c                       |  25 +++--
 mm/gup.c                                |  22 ++++-
 10 files changed, 254 insertions(+), 19 deletions(-)


base-commit: 1c23f9e627a7b412978b4e852793c5e3c3efc555
--
2.30.2