Re: [PATCH v3 1/4] mm: Non-pmd-mappable, large folios for folio_add_new_anon_rmap()

From: David Hildenbrand
Date: Mon Jul 17 2023 - 09:23:40 EST


On 17.07.23 15:13, Ryan Roberts wrote:
On 17/07/2023 14:00, David Hildenbrand wrote:
On 14.07.23 18:17, Ryan Roberts wrote:
In preparation for FLEXIBLE_THP support, improve
folio_add_new_anon_rmap() to allow a non-pmd-mappable, large folio to be
passed to it. In this case, all contained pages are accounted using the
order-0 folio (or base page) scheme.

Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx>
Reviewed-by: Yu Zhao <yuzhao@xxxxxxxxxx>
Reviewed-by: Yin Fengwei <fengwei.yin@xxxxxxxxx>
---
  mm/rmap.c | 28 +++++++++++++++++++++-------
  1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 0c0d8857dfce..f293d072368a 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1278,31 +1278,45 @@ void page_add_anon_rmap(struct page *page, struct
vm_area_struct *vma,
   * This means the inc-and-test can be bypassed.
   * The folio does not have to be locked.
   *
- * If the folio is large, it is accounted as a THP.  As the folio
+ * If the folio is pmd-mappable, it is accounted as a THP.  As the folio
   * is new, it's assumed to be mapped exclusively by a single process.
   */
  void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
          unsigned long address)
  {
-    int nr;
+    int nr = folio_nr_pages(folio);

-    VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
+    VM_BUG_ON_VMA(address < vma->vm_start ||
+            address + (nr << PAGE_SHIFT) > vma->vm_end, vma);
      __folio_set_swapbacked(folio);

-    if (likely(!folio_test_pmd_mappable(folio))) {
+    if (!folio_test_large(folio)) {

Why remove the "likely" here? The patch itself does not change anything about
that condition.

Good question; I'm not sure why. Will have to put it down to bad copy/paste
fixup. Will put it back in the next version.


          /* increment count (starts at -1) */
          atomic_set(&folio->_mapcount, 0);
-        nr = 1;
+        __page_set_anon_rmap(folio, &folio->page, vma, address, 1);
+    } else if (!folio_test_pmd_mappable(folio)) {
+        int i;
+
+        for (i = 0; i < nr; i++) {
+            struct page *page = folio_page(folio, i);
+
+            /* increment count (starts at -1) */
+            atomic_set(&page->_mapcount, 0);
+            __page_set_anon_rmap(folio, page, vma,
+                    address + (i << PAGE_SHIFT), 1);
+        }
+
+        /* increment count (starts at 0) */

That comment is a bit misleading. We're not talking about a mapcount as in the
other cases here.

Correct, I'm talking about _nr_pages_mapped, which starts 0, not -1 like
_mapcount. The comment was intended to be in the style used in other similar
places in rmap.c. I could change it to: "_nr_pages_mapped is 0-based, so set it
to the number of pages in the folio" or remove it entirely? What do you prefer?


We only have to comment what's weird, not what's normal.

IOW, we also didn't have such a comment in the existing code when doing atomic_set(&folio->_nr_pages_mapped, COMPOUND_MAPPED);


What might make sense here is a simple

"All pages of the folio are PTE-mapped."

--
Cheers,

David / dhildenb