Re: [PATCH v2 01/14] mm: Batch-copy PTE ranges during fork()

From: Ryan Roberts
Date: Thu Nov 16 2023 - 05:08:03 EST


Hi All,

Hoping for some guidance below!


On 15/11/2023 21:26, kernel test robot wrote:
> Hi Ryan,
>
> kernel test robot noticed the following build errors:
>
> [auto build test ERROR on akpm-mm/mm-everything]
> [also build test ERROR on linus/master v6.7-rc1 next-20231115]
> [cannot apply to arm64/for-next/core efi/next]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url: https://github.com/intel-lab-lkp/linux/commits/Ryan-Roberts/mm-Batch-copy-PTE-ranges-during-fork/20231116-010123
> base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
> patch link: https://lore.kernel.org/r/20231115163018.1303287-2-ryan.roberts%40arm.com
> patch subject: [PATCH v2 01/14] mm: Batch-copy PTE ranges during fork()
> config: arm-randconfig-002-20231116 (https://download.01.org/0day-ci/archive/20231116/202311160516.kHhfmjvl-lkp@xxxxxxxxx/config)
> compiler: arm-linux-gnueabi-gcc (GCC) 13.2.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231116/202311160516.kHhfmjvl-lkp@xxxxxxxxx/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@xxxxxxxxx>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202311160516.kHhfmjvl-lkp@xxxxxxxxx/
>
> All errors (new ones prefixed by >>):
>
> mm/memory.c: In function 'folio_nr_pages_cont_mapped':
> >> mm/memory.c:969:16: error: implicit declaration of function 'pte_pgprot'; did you mean 'ptep_get'? [-Werror=implicit-function-declaration]
>   969 |         prot = pte_pgprot(pte_mkold(pte_mkclean(ptent)));
>       |                ^~~~~~~~~~
>       |                ptep_get
> cc1: some warnings being treated as errors

It turns out that pte_pgprot() is not universal; it's only implemented by
architectures that select CONFIG_HAVE_IOREMAP_PROT (currently arc, arm64,
loongarch, mips, powerpc, s390, sh and x86).

I'm using it in core-mm to help calculate the number of "contiguously mapped"
pages within a folio (note that's not the same as arm64's notion of
contpte-mapped; I just want to know that there are N physically contiguous
pages mapped virtually contiguously with the same permissions). And I'm using
pte_pgprot() to extract the permissions from each pte so they can be compared.
Comparing the permissions matters because the fact that pages belong to the
same folio doesn't imply they are mapped with the same permissions; think of
mprotect()ing a sub-range.
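
For illustration, the per-pte comparison boils down to something like the
sketch below. pte_prot_equal() is a made-up helper name, not something in the
patch; the dirty and young bits are normalized out first because they can
legitimately differ between ptes of the same run:

/* Sketch: do two ptes map their pages with identical permissions? */
static inline bool pte_prot_equal(pte_t a, pte_t b)
{
	/* Dirty and young may vary per-pte; ignore them. */
	a = pte_mkold(pte_mkclean(a));
	b = pte_mkold(pte_mkclean(b));

	return pgprot_val(pte_pgprot(a)) == pgprot_val(pte_pgprot(b));
}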

I don't have a great idea for how to fix this - does anyone have any thoughts?

Some ideas:

- Implement folio_nr_pages_cont_mapped() conditionally on
CONFIG_HAVE_IOREMAP_PROT being set; otherwise it just returns 1, so those
arches always get the old, non-batching behavior (see the sketch after this
list). There is some precedent; mm/memory.c already uses pte_pgprot() behind
this ifdef.

- Implement a generic helper the same way arm64 does it. This returns all the
pte bits that are not part of the PFN: pfn_pte(pfn, __pgprot(0)) rebuilds a
pte with the same PFN but empty permissions, so XORing it with the original
pte leaves only the non-PFN bits. But I'm not sure this is definitely a valid
thing to do for all architectures:

static inline pgprot_t pte_pgprot(pte_t pte)
{
	unsigned long pfn = pte_pfn(pte);

	return __pgprot(pte_val(pfn_pte(pfn, __pgprot(0))) ^ pte_val(pte));
}

- Explicitly implement pte_pgprot() for all arches that don't currently have it
(sigh).
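
To make the first idea concrete, here's a rough sketch of the shape it could
take. The batching implementation behind the ifdef is the function quoted at
the bottom of this mail; the fallback stub is hypothetical and just preserves
today's per-pte behavior:

#ifdef CONFIG_HAVE_IOREMAP_PROT
/* The batching implementation from this patch (quoted below). */
static int folio_nr_pages_cont_mapped(struct folio *folio,
				      struct page *page, pte_t *pte,
				      unsigned long addr, unsigned long end,
				      pte_t ptent, bool *any_dirty);
#else
/*
 * Without pte_pgprot() we can't compare permissions, so report a run
 * of 1 page and fall back to the existing per-pte copy in fork().
 */
static inline int folio_nr_pages_cont_mapped(struct folio *folio,
					     struct page *page, pte_t *pte,
					     unsigned long addr,
					     unsigned long end, pte_t ptent,
					     bool *any_dirty)
{
	return 1;
}
#endif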

Thanks,
Ryan


>
>
> vim +969 mm/memory.c
>
>    950	
>    951	static int folio_nr_pages_cont_mapped(struct folio *folio,
>    952					       struct page *page, pte_t *pte,
>    953					       unsigned long addr, unsigned long end,
>    954					       pte_t ptent, bool *any_dirty)
>    955	{
>    956		int floops;
>    957		int i;
>    958		unsigned long pfn;
>    959		pgprot_t prot;
>    960		struct page *folio_end;
>    961	
>    962		if (!folio_test_large(folio))
>    963			return 1;
>    964	
>    965		folio_end = &folio->page + folio_nr_pages(folio);
>    966		end = min(page_cont_mapped_vaddr(folio_end, page, addr), end);
>    967		floops = (end - addr) >> PAGE_SHIFT;
>    968		pfn = page_to_pfn(page);
>  > 969		prot = pte_pgprot(pte_mkold(pte_mkclean(ptent)));
>    970	
>    971		*any_dirty = pte_dirty(ptent);
>    972	
>    973		pfn++;
>    974		pte++;
>    975	
>    976		for (i = 1; i < floops; i++) {
>    977			ptent = ptep_get(pte);
>    978			ptent = pte_mkold(pte_mkclean(ptent));
>    979	
>    980			if (!pte_present(ptent) || pte_pfn(ptent) != pfn ||
>    981			    pgprot_val(pte_pgprot(ptent)) != pgprot_val(prot))
>    982				break;
>    983	
>    984			if (pte_dirty(ptent))
>    985				*any_dirty = true;
>    986	
>    987			pfn++;
>    988			pte++;
>    989		}
>    990	
>    991		return i;
>    992	}
>    993	
>