[PATCH V2] mm: Recheck page table entry with page table lock held

From: Aneesh Kumar K.V
Date: Tue Sep 25 2018 - 23:21:02 EST


We clear the pte temporarily during read/modify/write update of the pte. If we
take a page fault while the pte is cleared, the application can get SIGBUS. One
such case is with remap_pfn_range without a backing vm_ops->fault callback.
do_fault will return SIGBUS in that case.

cpu 0 cpu1
mprotect()
ptep_modify_prot_start()/pte cleared.
.
. page fault.
.
.
prep_modify_prot_commit()

Fix this by taking page table lock and rechecking for pte_none.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>
---
V1:
* update commit message.

mm/memory.c | 31 +++++++++++++++++++++++++++----
1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index c467102a5cbc..c2f933184303 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3745,10 +3745,33 @@ static vm_fault_t do_fault(struct vm_fault *vmf)
struct vm_area_struct *vma = vmf->vma;
vm_fault_t ret;

- /* The VMA was not fully populated on mmap() or missing VM_DONTEXPAND */
- if (!vma->vm_ops->fault)
- ret = VM_FAULT_SIGBUS;
- else if (!(vmf->flags & FAULT_FLAG_WRITE))
+ /*
+ * The VMA was not fully populated on mmap() or missing VM_DONTEXPAND
+ */
+ if (!vma->vm_ops->fault) {
+
+ /*
+ * pmd entries won't be marked none during a R/M/W cycle.
+ */
+ if (unlikely(pmd_none(*vmf->pmd)))
+ ret = VM_FAULT_SIGBUS;
+ else {
+ vmf->ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
+ /*
+ * Make sure this is not a temporary clearing of pte
+ * by holding ptl and checking again. A R/M/W update
+ * of pte involves: take ptl, clearing the pte so that
+ * we don't have concurrent modification by hardware
+ * followed by an update.
+ */
+ spin_lock(vmf->ptl);
+ if (unlikely(pte_none(*vmf->pte)))
+ ret = VM_FAULT_SIGBUS;
+ else
+ ret = VM_FAULT_NOPAGE;
+ spin_unlock(vmf->ptl);
+ }
+ } else if (!(vmf->flags & FAULT_FLAG_WRITE))
ret = do_read_fault(vmf);
else if (!(vma->vm_flags & VM_SHARED))
ret = do_cow_fault(vmf);
--
2.17.1