Re: [PATCH] x86: mm: Do not use set_{pud,pmd}_safe when splitting the large page

From: Peter Zijlstra
Date: Tue Apr 09 2019 - 05:39:46 EST


On Tue, Apr 09, 2019 at 10:40:31AM +0200, Peter Zijlstra wrote:
> On Mon, Apr 08, 2019 at 07:11:21PM +0000, Singh, Brijesh wrote:
> > Commit 0a9fe8ca844d ("x86/mm: Validate kernel_physical_mapping_init()
> > PTE population") triggers the below warning in an SEV guest.
> >
> > WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/pgalloc.h:87 phys_pmd_init+0x30d/0x386
> > Call Trace:
> > kernel_physical_mapping_init+0xce/0x259
> > early_set_memory_enc_dec+0x10f/0x160
> > kvm_smp_prepare_boot_cpu+0x71/0x9d
> > start_kernel+0x1c9/0x50b
> > secondary_startup_64+0xa4/0xb0
> >
> > The SEV guest calls kernel_physical_mapping_init() to clear the encryption
> > mask from an existing mapping. While clearing the encryption mask,
> > kernel_physical_mapping_init() splits large pages into smaller ones.
> > To split a page, kernel_physical_mapping_init() allocates a new page
> > table page and updates the existing entry. The set_{pud,pmd}_safe helpers
> > trigger a warning when updating an entry that is already present. We
> > should use set_{pud,pmd} when replacing an existing entry with a new one.
> >
> > Updating an entry also requires a TLB flush. Currently the caller
> > (early_set_memory_enc_dec()) takes care of issuing the TLB flushes.
>
> I'm not entirely sure I like this, this means all users of
> kernel_physical_mapping_init() now need to be aware and careful.
>
> That said; the alternative is adding an argument to the function and
> propagating it through the callchain and dynamically switching between
> _safe and not. Which doesn't sound ideal either.
>
> Anybody else got clever ideas?
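
For reference, the _safe variants added by 0a9fe8ca844d boil down to
something like this (simplified sketch, not the exact pgalloc.h text):

#define set_pmd_safe(pmdp, pmd) \
({ \
	WARN_ON_ONCE(pmd_present(*pmdp) && !pmd_same(*pmdp, pmd)); \
	set_pmd(pmdp, pmd); \
})

So splitting an existing large PMD unavoidably trips the WARN_ON_ONCE():
the old entry is present and differs from the new one.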

The more I think about it, the more I think that (adding the argument and
switching between _safe and not) is in fact the right thing to do.

Rename kernel_physical_mapping_init() to __kernel_physical_mapping_init(),
add that flag, and thread it down to all the set_{pud,pmd,pte}() thingies.
Then add:

unsigned long
kernel_physical_mapping_init(unsigned long paddr_start,
			     unsigned long paddr_end,
			     unsigned long page_size_mask)
{
	return __kernel_physical_mapping_init(paddr_start, paddr_end,
					      page_size_mask, true);
}

unsigned long
kernel_physical_mapping_change(unsigned long paddr_start,
			       unsigned long paddr_end,
			       unsigned long page_size_mask)
{
	unsigned long last;

	last = __kernel_physical_mapping_init(paddr_start, paddr_end,
					      page_size_mask, false);

	__flush_tlb_all();

	return last;
}

Or something along those lines.
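
The flag could then be hidden behind small per-level helpers so the
phys_{pud,pmd,pte}_init() code stays readable; rough sketch (the helper
name is made up, not an existing interface):

static void __set_pmd_init(pmd_t *pmdp, pmd_t pmd, bool init)
{
	if (init)
		set_pmd_safe(pmdp, pmd);	/* populating: warn if already present */
	else
		set_pmd(pmdp, pmd);		/* changing an existing (present) entry */
}

with equivalents for the pte/pud/p4d cases, and
kernel_physical_mapping_change() being the only path that passes
init == false.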