Re: [PATCH] mm: Move __vma_address() to internal.h to be inlined in huge_memory.c

From: Waiman Long
Date: Mon Jun 16 2014 - 15:34:58 EST


On 06/12/2014 05:45 PM, David Rientjes wrote:
On Thu, 12 Jun 2014, Waiman Long wrote:

The vma_address() function which is used to compute the virtual address
within a VMA is used only by 2 files in the mm subsystem - rmap.c and
huge_memory.c. This function is defined in rmap.c and is inlined by
its callers there, but it is also declared as an external function.

However, the __split_huge_page() function which calls vma_address()
in huge_memory.c is calling it as a real function call. This is not
as efficient as an inlined function. This patch moves the underlying
inlined __vma_address() function to internal.h to be shared by both
the rmap.c and huge_memory.c file.
This increases huge_memory.o's text+data_bss by 311 bytes, which makes
me suspect that it is a bad change due to its increase of kernel cache
footprint.

Perhaps we should be noinlining __vma_address()?
On my test machine, I saw an increase of 144 bytes in the text segment
of huge_memory.o. The size in size is caused by an increase in the size
of the __split_huge_page function. When I remove the

if (unlikely(is_vm_hugetlb_page(vma)))
pgoff = page->index<< huge_page_order(page_hstate(page));

check, the increase in size drops down to 24 bytes. As a THP cannot be
a hugetlb page, there is no point in doing this check for a THP. I will
update the patch to pass in an additional argument to disable this
check for __split_huge_page.

I think we're seeking a reason or performance numbers that suggest
__vma_address() being inline is appropriate and so far we lack any such
evidence. Adding additional parameters to determine checks isn't going to
change the fact that it increases text size needlessly.

This patch was motivated by my investigation of a freeze problem of an application running on SLES11 sp3 which was caused by the long time it took to munmap part of a THP. Inlining vma_address help a bit in that situation. However, the problem will be essentially gone after including patches that changing the anon_vma_chain to use rbtree instead of a simple list.

I do agree that performance impact of inlining vma_address in minimal in the latest kernel. So I am not going to pursue this any further.

Thank for the review.

-Longman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/