Re: [External] Re: [PATCH v2 00/19] Free some vmemmap pages of hugetlb page

From: Michal Hocko
Date: Fri Oct 30 2020 - 11:20:46 EST


On Fri 30-10-20 18:24:25, Muchun Song wrote:
> On Fri, Oct 30, 2020 at 5:14 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > On Mon 26-10-20 22:50:55, Muchun Song wrote:
> > > If we use 1G hugetlb pages, we can save 4095 pages per huge page. This
> > > is a very substantial gain. Our servers run SPDK/QEMU applications which
> > > use 1000GB of hugetlb pages. With this feature enabled, we can save
> > > ~16GB (1G hugepages) / ~11GB (2MB hugepages) of memory.
> > [...]
> > > 15 files changed, 1091 insertions(+), 165 deletions(-)
> > > create mode 100644 include/linux/bootmem_info.h
> > > create mode 100644 mm/bootmem_info.c
> >
> > This is a neat idea but the code footprint is really non-trivial, and it
> > is added to very tricky code, which hugetlb unfortunately is.
> >
> > Saving 1.6% of memory is definitely interesting, especially for 1GB pages
> > which tend to be more static and where the savings are more visible.
> >
> > Anyway, I haven't seen any runtime overhead analysis here. What is the
> > price of modifying the vmemmap page tables and making them pte rather
> > than pmd based (especially for 2MB hugetlb)? Also, how expensive is the
> > vmemmap page table reconstruction on the freeing path?
>
> Yeah, I haven't tested the remapping overhead of reserving a hugetlb
> page. I can do that. But the overhead is not on the allocation/freeing of
> each hugetlb page, it is only once when we reserve some hugetlb pages
> through /proc/sys/vm/nr_hugepages. Once the reservation is successful,
> the subsequent allocation, freeing and using are the same as before
> (not patched).

Yes, that is quite clear. Except for the hugetlb overcommit and
migration if the pool is depleted. Maybe a few other cases.

> So I think that the overhead is acceptable.

Having some numbers for such a large feature is really needed.
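
For reference, the memory-saving figures quoted above (not the runtime
overhead numbers asked for here) can be reproduced with simple arithmetic.
The sketch below is only a back-of-the-envelope check. It assumes 4 KiB base
pages, which the thread does not state, and takes the 4095 pages saved per
1G page from the cover letter. The function name is hypothetical. The 2MB
case is left out because the per-page saving is not given in this message.

```python
# Back-of-the-envelope check of the quoted vmemmap savings.
# Assumption (not stated in the thread): base pages are 4 KiB.
BASE_PAGE_SIZE = 4096
GIB = 1 << 30


def vmemmap_savings_bytes(pool_gib, hugepage_gib, pages_saved_per_hugepage):
    """Bytes of vmemmap freed for a hugetlb pool of pool_gib GiB."""
    nr_hugepages = pool_gib // hugepage_gib
    return nr_hugepages * pages_saved_per_hugepage * BASE_PAGE_SIZE


# 1000GB pool of 1G pages, 4095 base pages saved per huge page.
saved = vmemmap_savings_bytes(1000, 1, 4095)
saved_gib = saved / GIB
saved_pct = 100 * saved / (1000 * GIB)
print(f"{saved_gib:.2f} GiB saved ({saved_pct:.2f}% of the pool)")
```

Under these assumptions the result is a little under 16 GiB, which is in
line with the "~16GB" and "1.6%" figures quoted above.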
--
Michal Hocko
SUSE Labs