Re: [PATCH mm-unstable v5 01/10] mm: add folio dtor and order setter functions

From: Sidhartha Kumar
Date: Wed Dec 07 2022 - 14:06:43 EST


On 12/7/22 10:49 AM, Sidhartha Kumar wrote:
On 12/7/22 10:12 AM, Mike Kravetz wrote:
On 12/07/22 12:11, Muchun Song wrote:


On Dec 7, 2022, at 11:42, Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:

On 12/07/22 11:34, Muchun Song wrote:


On Nov 30, 2022, at 06:50, Sidhartha Kumar <sidhartha.kumar@xxxxxxxxxx> wrote:

Add folio equivalents for set_compound_order() and set_compound_page_dtor().

Also remove extra new-lines introduced by mm/hugetlb: convert
move_hugetlb_state() to folios and mm/hugetlb_cgroup: convert
hugetlb_cgroup_uncharge_page() to folios.

Suggested-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Suggested-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@xxxxxxxxxx>
---
include/linux/mm.h | 16 ++++++++++++++++
mm/hugetlb.c       |  4 +---
2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a48c5ad16a5e..2bdef8a5298a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -972,6 +972,13 @@ static inline void set_compound_page_dtor(struct page *page,
page[1].compound_dtor = compound_dtor;
}

+static inline void folio_set_compound_dtor(struct folio *folio,
+ enum compound_dtor_id compound_dtor)
+{
+ VM_BUG_ON_FOLIO(compound_dtor >= NR_COMPOUND_DTORS, folio);
+ folio->_folio_dtor = compound_dtor;
+}
+
void destroy_large_folio(struct folio *folio);

static inline int head_compound_pincount(struct page *head)
@@ -987,6 +994,15 @@ static inline void set_compound_order(struct page *page, unsigned int order)
#endif
}

+static inline void folio_set_compound_order(struct folio *folio,
+ unsigned int order)
+{
+ folio->_folio_order = order;
+#ifdef CONFIG_64BIT
+ folio->_folio_nr_pages = order ? 1U << order : 0;

It seems that you expect callers to pass 0 as the order. However, the
->_folio_nr_pages and ->_folio_order fields are invalid for order-0 pages,
so they should not be touched. So this should be:

static inline void folio_set_compound_order(struct folio *folio,
     unsigned int order)
{
    if (!folio_test_large(folio))
        return;

    folio->_folio_order = order;
#ifdef CONFIG_64BIT
    folio->_folio_nr_pages = 1U << order;
#endif
}

I believe this was changed to accommodate the code in
__destroy_compound_gigantic_page().  It is used in a subsequent patch.
Here is the v6.0 version of the routine.

Thanks for your clarification.


static void __destroy_compound_gigantic_page(struct page *page,
unsigned int order, bool demote)
{
    int i;
    int nr_pages = 1 << order;
    struct page *p = page + 1;

    atomic_set(compound_mapcount_ptr(page), 0);
    atomic_set(compound_pincount_ptr(page), 0);

    for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
        p->mapping = NULL;
        clear_compound_head(p);
        if (!demote)
            set_page_refcounted(p);
    }

    set_compound_order(page, 0);
#ifdef CONFIG_64BIT
    page[1].compound_nr = 0;
#endif
    __ClearPageHead(page);
}


Might have been better to change this set_compound_order call to
folio_set_compound_order in this patch.


Agreed. It has confused me a lot. I suggest changing the code to the
following. The folio_test_large() check is still needed to catch
unexpected callers that would cause an OOB access.

static inline void folio_set_compound_order(struct folio *folio,
                        unsigned int order)
{
    VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
    // or
    // if (!folio_test_large(folio))
    //     return;

    folio->_folio_order = order;
#ifdef CONFIG_64BIT
    folio->_folio_nr_pages = order ? 1U << order : 0;
#endif
}

I think the VM_BUG_ON_FOLIO is appropriate as it would at least flag
data corruption.

As Mike pointed out, my intention in supporting the 0 case was to clean up
the __destroy_compound_gigantic_page() code by moving the #ifdef CONFIG_64BIT
lines into folio_set_compound_order(). I'll add the VM_BUG_ON_FOLIO line as
well as a comment to make it clear that a zero order is not normally supported.
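
To make the zero-order special case concrete, here is a minimal userspace
sketch of the setter under discussion. It is not kernel code: struct
mock_folio, its fields, and mock_folio_set_order() are hypothetical
stand-ins for struct folio, _folio_order/_folio_nr_pages, and
folio_set_compound_order(), and a plain assert() stands in for
VM_BUG_ON_FOLIO().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the folio fields discussed above. */
struct mock_folio {
	bool large;            /* models the result of folio_test_large() */
	unsigned int order;    /* models _folio_order */
	unsigned int nr_pages; /* models _folio_nr_pages (CONFIG_64BIT only in the kernel) */
};

/* Models the proposed setter, including the zero-order special case. */
static inline void mock_folio_set_order(struct mock_folio *folio,
					unsigned int order)
{
	/* Stand-in for VM_BUG_ON_FOLIO(!folio_test_large(folio), folio). */
	assert(folio->large);

	folio->order = order;
	/*
	 * 1U << 0 evaluates to 1, so an explicit 0 is needed for order 0
	 * to match the old "page[1].compound_nr = 0" teardown behavior.
	 */
	folio->nr_pages = order ? 1U << order : 0;
}
```

In this model, setting order 9 on a large folio gives nr_pages == 512,
while setting order 0 gives nr_pages == 0 rather than 1, which is what the
teardown path relies on.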

Thinking about this some more, it seems that hugetlb is the only caller
that abuses folio_set_compound_order (and previously set_compound_order)
by passing in a zero order.  Since it is unlikely that anyone knows of
this abuse, it might be good to add a comment to the routine to note
why it handles the zero case.  This might help prevent changes which
would potentially break hugetlb.

+/*
+ * _folio_nr_pages and _folio_order are invalid for
+ * order-zero pages. An exception is hugetlb, which passes
+ * in a zero order in __destroy_compound_gigantic_page().
+ */
 static inline void folio_set_compound_order(struct folio *folio,
                unsigned int order)
 {
+       VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
+
        folio->_folio_order = order;
 #ifdef CONFIG_64BIT
        folio->_folio_nr_pages = order ? 1U << order : 0;

Does this comment work?



I will change the comment from referencing __destroy_compound_gigantic_page()
to __destroy_compound_gigantic_folio(), although __prep_compound_gigantic_folio()
is another user of folio_set_compound_order(folio, 0). Should the sentence
just be "An exception is hugetlb, which passes in a zero order"?