Re: [PATCH v8 1/3] mm/mmap: separate writenotify and dirty tracking logic

From: David Hildenbrand
Date: Wed May 03 2023 - 10:32:32 EST


On 03.05.23 00:51, Lorenzo Stoakes wrote:
vma_wants_writenotify() is specifically intended for setting PTE page table
flags, accounting for existing page table flag state and whether the
filesystem performs dirty tracking.

Separate out the notions of dirty tracking and PTE write notify checking in
order that we can invoke the dirty tracking check from elsewhere.

Note that this change introduces a very small duplicate check of the
separated out vm_ops_needs_writenotify() and vma_is_shared_writable()
functions. This is necessary to avoid making vma_needs_dirty_tracking()
needlessly complicated (e.g. passing flags or having it assume checks were
already performed). This is small enough that it doesn't seem too
egregious.

We check to ensure the mapping is shared writable, as any GUP caller will
be safe - MAP_PRIVATE mappings will be CoW'd and read-only file-backed
shared mappings are not permitted access, even with FOLL_FORCE.

Signed-off-by: Lorenzo Stoakes <lstoakes@xxxxxxxxx>
Reviewed-by: John Hubbard <jhubbard@xxxxxxxxxx>
Reviewed-by: Mika Penttilä <mpenttil@xxxxxxxxxx>
Reviewed-by: Jan Kara <jack@xxxxxxx>
Reviewed-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
---
include/linux/mm.h | 1 +
mm/mmap.c | 53 ++++++++++++++++++++++++++++++++++------------
2 files changed, 41 insertions(+), 13 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 27ce77080c79..7b1d4e7393ef 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2422,6 +2422,7 @@ extern unsigned long move_page_tables(struct vm_area_struct *vma,
#define MM_CP_UFFD_WP_ALL (MM_CP_UFFD_WP | \
MM_CP_UFFD_WP_RESOLVE)
+bool vma_needs_dirty_tracking(struct vm_area_struct *vma);
int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot);
static inline bool vma_wants_manual_pte_write_upgrade(struct vm_area_struct *vma)
{
diff --git a/mm/mmap.c b/mm/mmap.c
index 5522130ae606..fa7442e44cc2 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1475,6 +1475,42 @@ SYSCALL_DEFINE1(old_mmap, struct mmap_arg_struct __user *, arg)
}
#endif /* __ARCH_WANT_SYS_OLD_MMAP */
+/* Do VMA operations imply write notify is required? */

Nit: comment is superfluous, this is already self-documenting code.

+static bool vm_ops_needs_writenotify(const struct vm_operations_struct *vm_ops)
+{
+ return vm_ops && (vm_ops->page_mkwrite || vm_ops->pfn_mkwrite);
+}
+
+/* Is this VMA shared and writable? */

Nit: dito

+static bool vma_is_shared_writable(struct vm_area_struct *vma)
+{
+ return (vma->vm_flags & (VM_WRITE | VM_SHARED)) ==
+ (VM_WRITE | VM_SHARED);
+}
+
+/*
+ * Does this VMA require the underlying folios to have their dirty state
+ * tracked?
+ */

Nit: dito

+bool vma_needs_dirty_tracking(struct vm_area_struct *vma)
+{
+ /* Only shared, writable VMAs require dirty tracking. */
+ if (!vma_is_shared_writable(vma))
+ return false;
+
+ /* Does the filesystem need to be notified? */
+ if (vm_ops_needs_writenotify(vma->vm_ops))
+ return true;
+
+ /* Specialty mapping? */
+ if (vma->vm_flags & VM_PFNMAP)
+ return false;
+
+ /* Can the mapping track the dirty pages? */
+ return vma->vm_file && vma->vm_file->f_mapping &&
+ mapping_can_writeback(vma->vm_file->f_mapping);
+}
+
/*
* Some shared mappings will want the pages marked read-only
* to track write events. If so, we'll downgrade vm_page_prot
@@ -1483,21 +1519,18 @@ SYSCALL_DEFINE1(old_mmap, struct mmap_arg_struct __user *, arg)
*/
int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot)
{
- vm_flags_t vm_flags = vma->vm_flags;
- const struct vm_operations_struct *vm_ops = vma->vm_ops;
-
/* If it was private or non-writable, the write bit is already clear */
- if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED)))
+ if (!vma_is_shared_writable(vma))
return 0;
/* The backer wishes to know when pages are first written to? */
- if (vm_ops && (vm_ops->page_mkwrite || vm_ops->pfn_mkwrite))
+ if (vm_ops_needs_writenotify(vma->vm_ops))
return 1;
/* The open routine did something to the protections that pgprot_modify
* won't preserve? */
if (pgprot_val(vm_page_prot) !=
- pgprot_val(vm_pgprot_modify(vm_page_prot, vm_flags)))
+ pgprot_val(vm_pgprot_modify(vm_page_prot, vma->vm_flags)))
return 0;
/*
@@ -1511,13 +1544,7 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot)
if (userfaultfd_wp(vma))
return 1;
- /* Specialty mapping? */
- if (vm_flags & VM_PFNMAP)
- return 0;
-
- /* Can the mapping track the dirty pages? */
- return vma->vm_file && vma->vm_file->f_mapping &&
- mapping_can_writeback(vma->vm_file->f_mapping);
+ return vma_needs_dirty_tracking(vma);
}
/*

We now have duplicate vma_is_shared_writable() and vm_ops_needs_writenotify() checks ...


Maybe move the VM_PFNMAP and "/* Can the mapping track the dirty pages? */" checks into a separate helper and call that from both, vma_wants_writenotify() and vma_needs_dirty_tracking() ?


In any case

Acked-by: David Hildenbrand <david@xxxxxxxxxx>

--
Thanks,

David / dhildenb