Re: [RFC v2 PATCH 0/4] speed up page allocation for __GFP_ZERO

From: Liang Li
Date: Tue Dec 22 2020 - 09:43:09 EST


> > =====================================================
> > QEMU uses 4K pages, THP is off
> >                  round1  round2  round3
> > w/o this patch:  23.5s   24.7s   24.6s
> > w/ this patch:   10.2s   10.3s   11.2s
> >
> > QEMU uses 4K pages, THP is on
> >                  round1  round2  round3
> > w/o this patch:  17.9s   14.8s   14.9s
> > w/ this patch:   1.9s    1.8s    1.9s
> > =====================================================
>
> The cost of zeroing pages has to be paid somewhere. You've successfully
> moved it out of this path that you can measure. So now you've put it
> somewhere that you're not measuring. Why is this a win?

Whether it is a win depends on its effect. In our case it solves the issue
we faced, so it can be considered a win for us.
If others don't have that issue, the result will be different; they may be
affected by the side effects of this feature. I think this is the concern
behind your question, right? I will try to do more tests and provide more
benchmark data.
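
To put the numbers quoted above in one place, here is a small Python sketch
that computes the mean speedup per configuration. The timings are copied from
the quoted results; the helper name `speedup` is only for illustration. Note
that this only summarizes the measured path and says nothing about where the
zeroing cost ends up.

```python
from statistics import mean

# Timings in seconds, copied from the quoted results (three rounds each).
results = {
    "THP off": {"without": [23.5, 24.7, 24.6], "with": [10.2, 10.3, 11.2]},
    "THP on": {"without": [17.9, 14.8, 14.9], "with": [1.9, 1.8, 1.9]},
}


def speedup(without, with_patch):
    """Ratio of mean time without the patch to mean time with it."""
    return mean(without) / mean(with_patch)


for name, runs in results.items():
    ratio = speedup(runs["without"], runs["with"])
    print(f"{name}: about {ratio:.1f}x faster on average")
```

With only three rounds per configuration this is a rough indication, not a
statistically solid result.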

> > Speed up kernel routine
> > =======================
> > This can’t be guaranteed because we don’t pre-zero all the free pages,
> > but it is true in most cases. It can help speed up some important system
> > calls such as fork, which allocates zeroed pages for building page
> > tables, and speed up page fault handling, especially for huge page
> > faults. A POC of hugetlb free page pre-zeroing has been done.
>
> Try kernbench with and without your patch.

OK. Thanks for your suggestion!

Liang