Re: [PATCH 10/12] hugetlb: batch PMD split for bulk vmemmap dedup

From: Mike Kravetz
Date: Mon Aug 28 2023 - 12:46:58 EST


On 08/28/23 10:42, Joao Martins wrote:
> On 26/08/2023 06:56, kernel test robot wrote:
> > Hi Mike,
> >
> > kernel test robot noticed the following build errors:
> >
> > [auto build test ERROR on next-20230825]
> > [cannot apply to akpm-mm/mm-everything v6.5-rc7 v6.5-rc6 v6.5-rc5 linus/master v6.5-rc7]
> > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > And when submitting patch, we suggest to use '--base' as documented in
> > https://git-scm.com/docs/git-format-patch#_base_tree_information]
> >
> > url: https://github.com/intel-lab-lkp/linux/commits/Mike-Kravetz/hugetlb-clear-flags-in-tail-pages-that-will-be-freed-individually/20230826-030805
> > base: next-20230825
> > patch link: https://lore.kernel.org/r/20230825190436.55045-11-mike.kravetz%40oracle.com
> > patch subject: [PATCH 10/12] hugetlb: batch PMD split for bulk vmemmap dedup
> > config: s390-randconfig-001-20230826 (https://download.01.org/0day-ci/archive/20230826/202308261325.ipTttZHZ-lkp@xxxxxxxxx/config)
> > compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
> > reproduce: (https://download.01.org/0day-ci/archive/20230826/202308261325.ipTttZHZ-lkp@xxxxxxxxx/reproduce)
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <lkp@xxxxxxxxx>
> > | Closes: https://lore.kernel.org/oe-kbuild-all/202308261325.ipTttZHZ-lkp@xxxxxxxxx/
> >
> > All error/warnings (new ones prefixed by >>):
> >
>
> [...]
>
> >>> mm/hugetlb_vmemmap.c:698:28: error: use of undeclared identifier 'TLB_FLUSH_ALL'
> > 698 | flush_tlb_kernel_range(0, TLB_FLUSH_ALL);
> > | ^
> > 2 warnings and 1 error generated.
> >
> >
>
> TLB_FLUSH_ALL is x86-only, so what I wrote above is wrong for code that should
> be architecture independent. I should have written the global TLB flush with
> flush_tlb_all(), which is what each arch implements.
>
> The alternative is to compose a start/end tuple in the top-level optimize-folios
> function as we iterate over folios to remap, and flush via
> flush_tlb_kernel_range(). But this would likely be relevant on x86 only, that
> is, to optimize the flushing of 3 contiguous 2M hugetlb pages (~24 vmemmap
> pages), as that's where the TLB flush ceiling is put (31 pages) for per-page VA
> flush, before falling back to a global TLB flush. I wasn't sure the added
> complexity was worth the dubious benefit, so I kept the global TLB flush.
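
To make the arithmetic behind the range-based alternative concrete, here is a
rough userspace sketch of it. It is not kernel code and not what the patch
does. It assumes 4 KiB base pages and a 64-byte struct page, takes the 31-page
ceiling from the paragraph above, and treats the vmemmap ranges of the folios
as contiguous. All names in it (vmemmap_flush_range, range_add, the SKETCH_*
constants) are hypothetical and exist only for illustration.

```c
#include <stdbool.h>

/* Assumptions for illustration only: 4 KiB pages, 64-byte struct page. */
#define SKETCH_PAGE_SIZE        4096UL
#define SKETCH_STRUCT_PAGE_SIZE 64UL
#define SKETCH_HUGE_PAGE_SIZE   (2UL << 20)

/* Per-page flush ceiling as quoted in the thread; the name is hypothetical. */
#define SKETCH_FLUSH_CEILING    31UL

/* Hypothetical accumulator for the vmemmap range touched by one batch. */
struct vmemmap_flush_range {
	unsigned long start;	/* lowest address seen, valid if !empty */
	unsigned long end;	/* one past the highest address seen */
	bool empty;
};

static void range_init(struct vmemmap_flush_range *r)
{
	r->start = 0;
	r->end = 0;
	r->empty = true;
}

/* Number of vmemmap pages holding the struct pages of one 2M hugetlb page. */
static unsigned long vmemmap_pages_per_hugepage(void)
{
	unsigned long nr_struct_pages = SKETCH_HUGE_PAGE_SIZE / SKETCH_PAGE_SIZE;

	return nr_struct_pages * SKETCH_STRUCT_PAGE_SIZE / SKETCH_PAGE_SIZE;
}

/* Widen the accumulated range to cover [start, end). */
static void range_add(struct vmemmap_flush_range *r,
		      unsigned long start, unsigned long end)
{
	if (r->empty) {
		r->start = start;
		r->end = end;
		r->empty = false;
		return;
	}
	if (start < r->start)
		r->start = start;
	if (end > r->end)
		r->end = end;
}

/* Pages spanned by the accumulated range, including any holes inside it. */
static unsigned long range_nr_pages(const struct vmemmap_flush_range *r)
{
	if (r->empty)
		return 0;
	return (r->end - r->start) / SKETCH_PAGE_SIZE;
}

/*
 * Models the decision described in the thread: at or below the ceiling a
 * per-page flush of the range is used, above it a global flush.
 */
static bool range_fits_per_page_flush(const struct vmemmap_flush_range *r)
{
	return range_nr_pages(r) <= SKETCH_FLUSH_CEILING;
}
```

Under those assumptions each 2M page needs 8 vmemmap pages, so three
contiguous hugetlb pages span 24 pages and stay under the ceiling, while a
fourth brings the span to 32 and tips it over to a global flush. Because the
accumulator only tracks a min/max, non-contiguous folios would inflate the
span and reach the ceiling even sooner.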

Thanks Joao.

I added my share of build issues to this RFC as can be seen in the bot
responses to other patches.

My assumption is that these build issues will not prevent people from
looking into and commenting on the bigger performance issue that was the
reason for this series. The build issues will of course be resolved if
there is some consensus that this is the way to move forward to address
this issue. If the build issues are a stumbling block for anyone looking
at this bigger issue, let me know and I will fix them all ASAP.
--
Mike Kravetz