[PATCH] mm/hugetlb: revert use of page_cache_next_miss()

From: Sidhartha Kumar
Date: Fri May 05 2023 - 14:54:10 EST


As reported by Ackerley[1], the use of page_cache_next_miss() in
hugetlbfs_fallocate() introduces a bug where a second fallocate() call to
same offset fails with -EEXIST. Revert this change and go back to the
previous method of using get from the page cache and then dropping the
reference on success.

hugetlbfs_pagecache_present() was also refactored to use
page_cache_next_miss(), revert the usage there as well.

User visible impacts include hugetlb fallocate incorrectly returning
EEXIST if pages are already present in the file. In addition, hugetlb
pages will not be included in core dumps if they need to be brought in via
GUP. userfaultfd UFFDIO_COPY also uses this code and will not notice pages
already present in the cache. It may try to allocate a new page and
potentially return ENOMEM as opposed to EEXIST.

Fixes: d0ce0e47b323 ("mm/hugetlb: convert hugetlb fault paths to use alloc_hugetlb_folio()")
Cc: <stable@xxxxxxxxxxxxxxx> #v6.3+
Reported-by: Ackerley Tng <ackerleytng@xxxxxxxxxx>
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@xxxxxxxxxx>

[1] https://lore.kernel.org/linux-mm/cover.1683069252.git.ackerleytng@xxxxxxxxxx/
---
This patch is meant to fix stable v6.3.1 as safe as possible by doing a
simple revert.

Patch page cache: fix page_cache_next/prev_miss off by one by Mike is a
potential fix that will allow the use of page_cache_next_miss() and is
awaiting review.

Patch Fix fallocate error in hugetlbfs when fallocating again by Ackerley
is another fix but introduces a new function and is also awaiting review.

fs/hugetlbfs/inode.c | 8 +++-----
mm/hugetlb.c | 11 +++++------
2 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 9062da6da5675..6d6cd8f26d76d 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -821,7 +821,6 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
*/
struct folio *folio;
unsigned long addr;
- bool present;

cond_resched();

@@ -845,10 +844,9 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
mutex_lock(&hugetlb_fault_mutex_table[hash]);

/* See if already present in mapping to avoid alloc/free */
- rcu_read_lock();
- present = page_cache_next_miss(mapping, index, 1) != index;
- rcu_read_unlock();
- if (present) {
+ folio = filemap_get_folio(mapping, index);
+ if (folio) {
+ folio_put(folio);
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
hugetlb_drop_vma_policy(&pseudo_vma);
continue;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 245038a9fe4ea..29ab27d2a3ef5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5666,13 +5666,12 @@ static bool hugetlbfs_pagecache_present(struct hstate *h,
{
struct address_space *mapping = vma->vm_file->f_mapping;
pgoff_t idx = vma_hugecache_offset(h, vma, address);
- bool present;
-
- rcu_read_lock();
- present = page_cache_next_miss(mapping, idx, 1) != idx;
- rcu_read_unlock();
+ struct folio *folio;

- return present;
+ folio = filemap_get_folio(mapping, idx);
+ if (folio)
+ folio_put(folio);
+ return folio != NULL;
}

int hugetlb_add_to_page_cache(struct folio *folio, struct address_space *mapping,
--
2.40.0