[RFC PATCH v3 04/11] mseal: add MM_SEAL_BASE

From: jeffxu
Date: Tue Dec 12 2023 - 18:17:35 EST


From: Jeff Xu <jeffxu@xxxxxxxxxxxx>

MM_SEAL_BASE covers the protections common to all VMA sealing types.
It blocks the following operations on a sealed VMA:
1> Unmapping, moving to another location, or shrinking, via munmap()
and mremap(). Each of these can leave an empty space that could then
be filled by a VMA with a new set of attributes.
2> Moving or expanding a different VMA into the current location, via
mremap().
3> Modifying a sealed VMA via mmap(MAP_FIXED).
4> Size expansion, via mremap(). This does not appear to pose any
specific risk to sealed VMAs, but it is blocked anyway because the use
case is unclear. In any case, users can rely on merging to expand a
sealed VMA.
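
To illustrate the kind of range check each of these paths relies on,
here is a minimal userspace sketch. It is not the kernel's
can_modify_mm(): struct sketch_vma, sketch_can_modify() and the
SKETCH_SEAL_BASE value are hypothetical names made up for this
illustration, and it assumes that any VMA overlapping the range with
the seal bit set causes the request to be rejected.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical value; the real flag is defined elsewhere in the series. */
#define SKETCH_SEAL_BASE 0x1UL

/* Simplified stand-in for a VMA covering [start, end). */
struct sketch_vma {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
	unsigned long seals;	/* bitmask of seal types */
};

/*
 * Return true if no VMA overlapping [start, end) carries any bit in
 * check_seals. Unmapped gaps inside the range do not block the request.
 */
static bool sketch_can_modify(const struct sketch_vma *vmas, size_t n,
			      unsigned long start, unsigned long end,
			      unsigned long check_seals)
{
	for (size_t i = 0; i < n; i++) {
		const struct sketch_vma *v = &vmas[i];
		bool overlaps = v->start < end && start < v->end;

		if (overlaps && (v->seals & check_seals))
			return false;
	}
	return true;
}
```

With [0x1000, 0x3000) sealed and [0x5000, 0x6000) unsealed, a request
covering [0x2000, 0x4000) would be refused in this model, while one
covering [0x3000, 0x6000) would pass.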

MM_SEAL_BASE is the feature on which the other sealing features will
depend. For instance, it probably does not make sense to seal
PROT_PKEY without sealing the BASE, so the kernel will implicitly add
SEAL_BASE for SEAL_PROT_PKEY. (If an application wants to relax this
in the future, we could use the flags field in mseal() to override
the behavior of implicitly adding SEAL_BASE.)
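
The implicit dependency described above could be expressed along these
lines. This is only a sketch: sketch_normalize_seals() and both flag
values are hypothetical and are not taken from this patch, which does
not contain the code that adds SEAL_BASE.

```c
/* Hypothetical bit values, chosen only for this illustration. */
#define SKETCH_SEAL_BASE      0x1UL
#define SKETCH_SEAL_PROT_PKEY 0x2UL

/*
 * Return the requested seal types, with the base seal added whenever
 * the PROT_PKEY seal is requested.
 */
static unsigned long sketch_normalize_seals(unsigned long types)
{
	if (types & SKETCH_SEAL_PROT_PKEY)
		types |= SKETCH_SEAL_BASE;
	return types;
}
```

A request for SKETCH_SEAL_PROT_PKEY alone comes back with both bits
set, and a request that does not include it is returned unchanged.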

Signed-off-by: Jeff Xu <jeffxu@xxxxxxxxxxxx>
---
mm/mmap.c | 23 +++++++++++++++++++++++
mm/mremap.c | 42 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 65 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index 42462c2a0c35..dbc557bd460c 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1259,6 +1259,13 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
return -EEXIST;
}

+ /*
+ * Check if the address range is sealed for do_mmap().
+ * can_modify_mm assumes we have acquired the lock on MM.
+ */
+ if (!can_modify_mm(mm, addr, addr + len, MM_SEAL_BASE))
+ return -EACCES;
+
if (prot == PROT_EXEC) {
pkey = execute_only_pkey(mm);
if (pkey < 0)
@@ -2632,6 +2639,14 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
if (end == start)
return -EINVAL;

+ /*
+ * Check if memory is sealed before arch_unmap.
+ * Prevent unmapping a sealed VMA.
+ * can_modify_mm assumes we have acquired the lock on MM.
+ */
+ if (!can_modify_mm(mm, start, end, MM_SEAL_BASE))
+ return -EACCES;
+
/* arch_unmap() might do unmaps itself. */
arch_unmap(mm, start, end);

@@ -3053,6 +3068,14 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
{
struct mm_struct *mm = vma->vm_mm;

+ /*
+ * Check if memory is sealed before arch_unmap.
+ * Prevent unmapping a sealed VMA.
+ * can_modify_mm assumes we have acquired the lock on MM.
+ */
+ if (!can_modify_mm(mm, start, end, MM_SEAL_BASE))
+ return -EACCES;
+
arch_unmap(mm, start, end);
return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
}
diff --git a/mm/mremap.c b/mm/mremap.c
index 382e81c33fc4..ff7429bfbbe1 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -835,7 +835,35 @@ static unsigned long mremap_to(unsigned long addr, unsigned long old_len,
if ((mm->map_count + 2) >= sysctl_max_map_count - 3)
return -ENOMEM;

+ /*
+ * mremap_to() moves a VMA to another address.
+ * Check if the src address is sealed; if so, reject.
+ * In other words, prevent a sealed VMA from being moved to
+ * another address.
+ *
+ * Place can_modify_mm here because mremap_to()
+ * does its own checking for address range, and we only
+ * check the sealing after passing those checks.
+ * can_modify_mm assumes we have acquired the lock on MM.
+ */
+ if (!can_modify_mm(mm, addr, addr + old_len, MM_SEAL_BASE))
+ return -EACCES;
+
if (flags & MREMAP_FIXED) {
+ /*
+ * mremap_to() moves a VMA to another address.
+ * Check if the dst address is sealed; if so, reject.
+ * In other words, prevent moving a VMA onto a sealed VMA.
+ *
+ * Place can_modify_mm here because mremap_to() does its
+ * own checking for address, and we only check the sealing
+ * after passing those checks.
+ * can_modify_mm assumes we have acquired the lock on MM.
+ */
+ if (!can_modify_mm(mm, new_addr, new_addr + new_len,
+ MM_SEAL_BASE))
+ return -EACCES;
+
ret = do_munmap(mm, new_addr, new_len, uf_unmap_early);
if (ret)
goto out;
@@ -994,6 +1022,20 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
goto out;
}

+ /*
+ * This is the shrink/expand case (not mremap_to()).
+ * Check if the src address is sealed; if so, reject.
+ * In other words, prevent shrinking or expanding a sealed VMA.
+ *
+ * Place can_modify_mm here so we can keep the logic related to
+ * shrink/expand together. Perhaps we can extract below to be its
+ * own function in future.
+ */
+ if (!can_modify_mm(mm, addr, addr + old_len, MM_SEAL_BASE)) {
+ ret = -EACCES;
+ goto out;
+ }
+
/*
* Always allow a shrinking remap: that just unmaps
* the unnecessary pages..
--
2.43.0.472.g3155946c3a-goog