Re: [PATCH] x86/mm: synchronize pgd in vmemmap_free()

From: John Hubbard
Date: Fri May 19 2017 - 20:35:03 EST


Hi Jerome,

On 05/19/2017 11:01 AM, JÃrÃme Glisse wrote:
When we free kernel virtual map we should synchronize p4d/pud for
all the pgds to avoid any stall entry in non canonical pgd.

"any stale entry in the non-canonical pgd", is what I think you meant to type there.

Also, it would be nice to clarify that commit description a bit: I'm not sure what is meant here by a "non-canonical pgd".

Also, it seems like the reshuffling of the internals of sync_global_pgds() deserves at least some mention here. More below.


Signed-off-by: JÃrÃme Glisse <jglisse@xxxxxxxxxx>
Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxx>
---
arch/x86/mm/init_64.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index ff95fe8..df753f8 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -108,8 +108,6 @@ void sync_global_pgds(unsigned long start, unsigned long end)
BUILD_BUG_ON(pgd_none(*pgd_ref));
p4d_ref = p4d_offset(pgd_ref, address);
- if (p4d_none(*p4d_ref))
- continue;
spin_lock(&pgd_lock);
list_for_each_entry(page, &pgd_list, lru) {
@@ -123,12 +121,16 @@ void sync_global_pgds(unsigned long start, unsigned long end)
pgt_lock = &pgd_page_get_mm(page)->page_table_lock;
spin_lock(pgt_lock);
- if (!p4d_none(*p4d_ref) && !p4d_none(*p4d))
- BUG_ON(p4d_page_vaddr(*p4d)
- != p4d_page_vaddr(*p4d_ref));
-
- if (p4d_none(*p4d))
+ if (p4d_none(*p4d_ref)) {
set_p4d(p4d, *p4d_ref);

Is the intention really to set p4d to a zeroed *p4d_ref, or is that a mistake?

+ } else {
+ if (!p4d_none(*p4d_ref) && !p4d_none(*p4d))

I think the code needs to be somewhat restructured, but as it stands, the above !p4d_none(*p4d_ref) will always be true, because first part of the if/else checked for the opposite case: p4d_none(*p4d_ref). This is a side effect of moving that block of code.

+ BUG_ON(p4d_page_vaddr(*p4d)
+ != p4d_page_vaddr(*p4d_ref));
+
+ if (p4d_none(*p4d))
+ set_p4d(p4d, *p4d_ref);
+ }
spin_unlock(pgt_lock);
}
@@ -1024,6 +1026,7 @@ remove_pagetable(unsigned long start, unsigned long end, bool direct)
void __ref vmemmap_free(unsigned long start, unsigned long end)
{
remove_pagetable(start, end, false);
+ sync_global_pgds(start, end - 1);

This does fix the HMM crash that I was seeing in hmm-next.

thanks,
John Hubbard
NVIDIA

}
#ifdef CONFIG_MEMORY_HOTREMOVE
--
2.4.11

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>