Re: [PATCH 1/1] mm:improve the performance during fork

From: Souptick Joarder
Date: Tue Dec 22 2020 - 10:08:51 EST


On Tue, Dec 22, 2020 at 5:49 PM <qianjun.kernel@xxxxxxxxx> wrote:
>
> From: jun qian <qianjun.kernel@xxxxxxxxx>
>
> In our project, many business delays come from fork, so
> we started looking into why fork is time-consuming.
> I used ftrace with the function_graph tracer to trace fork and found
> that vm_normal_page is called tens of thousands of times, while
> each call takes only a few nanoseconds. vm_normal_page is not an
> inline function, so I think that making it inline may reduce the
> call overhead.
>
> I did the following experiment:
>
> I wrote the following C test code (please ignore the memory leak :)).
> Before the fork, it mallocs 4G bytes, then calculates the fork
> time.
>
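> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <sys/time.h>
>
> /* LEN is not defined in the posting; assumed to be 4G bytes / 4096. */
> #define LEN (4ULL * 1024 * 1024 * 1024 / 4096)
>
> int main(void)
> {
>         char *p;
>         unsigned long long i = 0;
>         float time_use = 0;
>         struct timeval start;
>         struct timeval end;
>
>         /* Touch one byte per allocation so each page is really mapped. */
>         for (i = 0; i < LEN; i++) {
>                 p = (char *)malloc(4096);
>                 if (p == NULL) {
>                         printf("malloc failed!\n");
>                         return 1;
>                 }
>                 p[0] = 0x55;
>         }
>         gettimeofday(&start, NULL);
>         fork();
>         gettimeofday(&end, NULL);
>
>         /* Both parent and child reach this point and print a result. */
>         time_use = (end.tv_sec * 1000000 + end.tv_usec) -
>                    (start.tv_sec * 1000000 + start.tv_usec);
>         printf("time_use is %.10f us\n", time_use);
>
>         return 0;
> }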
>
> We need to compare the size of vmlinux and the fork time in the
> inline and non-inline cases. vm_normal_page is called from many
> functions, so we also need to compare the size of its callers.
> For example, do_wp_page calls vm_normal_page, so I also measured
> its size.
>
>                   inline           non-inline       diff
> vmlinux size      9709248 bytes    9709824 bytes    -576 bytes
> fork time         23475ns          24638ns          -4.7%

Do you have the time difference for both the parent and the child process?

> do_wp_page size   972              743              +229
>
> According to the above test data, I think inlining vm_normal_page can
> reduce fork execution time.
>
> Signed-off-by: jun qian <qianjun.kernel@xxxxxxxxx>
> ---
> mm/memory.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 7d608765932b..a689bb5d3842 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -591,7 +591,7 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
> * PFNMAP mappings in order to support COWable mappings.
> *
> */
> -struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
> +inline struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
> pte_t pte)
> {
> unsigned long pfn = pte_pfn(pte);
> --
> 2.18.2
>
>