[PATCH 1/1] mm: fix the theoretical compound_lock() vs prep_new_page() race

From: Oleg Nesterov
Date: Thu Dec 19 2013 - 14:09:09 EST


get/put_page(thp_tail) paths do get_page_unless_zero(page_head) +
compound_lock(). In theory this page_head can already have been freed
and reallocated by alloc_pages(__GFP_COMP, smaller_order). In this
case get_page_unless_zero() can succeed right after
set_page_refcounted(), and compound_lock() can race with the
non-atomic __SetPageHead() in prep_compound_page().

Perhaps we should rework the THP locking (under discussion), but
until then this patch moves set_page_refcounted() to the end of
prep_new_page() and adds smp_wmb() to ensure that the page->_count
!= 0 store is the last change made to the newly allocated page.
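
With this change the pairing looks like this (sketch; the reader side
relies on the fact that a successful atomic_inc_not_zero(), which
implements get_page_unless_zero(), implies a full memory barrier):

	CPU 0 (prep_new_page)           CPU 1 (get/put_page paths)
	---------------------           --------------------------
	__SetPageHead(), ...            if (!get_page_unless_zero(head))
	smp_wmb();                              return ...;
	set_page_refcounted(page);      /* full barrier implied by the
	                                   successful atomic op */
	                                compound_lock(head);

so a reader that observes page->_count != 0 must also observe the
completed __SetPage*() stores.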

I am not sure about other callers of set_page_refcounted(), but at
first glance they look fine to me.

Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>
---
mm/page_alloc.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 115b23b..9402337 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -865,8 +865,6 @@ static int prep_new_page(struct page *page, int order, gfp_t gfp_flags)
}

set_page_private(page, 0);
- set_page_refcounted(page);
-
arch_alloc_page(page, order);
kernel_map_pages(page, 1 << order, 1);

@@ -876,6 +874,14 @@ static int prep_new_page(struct page *page, int order, gfp_t gfp_flags)
if (order && (gfp_flags & __GFP_COMP))
prep_compound_page(page, order);

+ /*
+ * Make sure the caller of get_page_unless_zero() will see the
+ * fully initialized page. Say, to ensure that compound_lock()
+ * can't race with the non-atomic __SetPage*() above.
+ */
+ smp_wmb();
+ set_page_refcounted(page);
+
return 0;
}

--
1.5.5.1

