Re: [PATCH v4] x86/power/64: Fix kernel text mapping corruption during image restoration

From: Andy Lutomirski
Date: Thu Jun 30 2016 - 11:24:38 EST


On Thu, Jun 30, 2016 at 8:17 AM, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> On Thu, Jun 30, 2016 at 5:05 PM, Borislav Petkov <bp@xxxxxxxxx> wrote:
>> On Thu, Jun 30, 2016 at 03:17:20PM +0200, Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>>>
>>> Logan Gunthorpe reports that hibernation stopped working reliably for
>>> him after commit ab76f7b4ab23 (x86/mm: Set NX on gap between __ex_table
>>> and rodata).
>>
>> ...
>>
>>> +static int relocate_restore_code(void)
>>> +{
>>> + pgd_t *pgd;
>>> + pud_t *pud;
>>> +
>>> + relocated_restore_code = get_safe_page(GFP_ATOMIC);
>>> + if (!relocated_restore_code)
>>> + return -ENOMEM;
>>> +
>>> + memcpy((void *)relocated_restore_code, &core_restore_code, PAGE_SIZE);
>>> +
>>> + /* Make the page containing the relocated code executable */
>>> + pgd = (pgd_t *)__va(read_cr3()) + pgd_index(relocated_restore_code);
>>> + pud = pud_offset(pgd, relocated_restore_code);
>>> + if (pud_large(*pud)) {
>>> + set_pud(pud, __pud(pud_val(*pud) & ~_PAGE_NX));
>>> + } else {
>>> + pmd_t *pmd = pmd_offset(pud, relocated_restore_code);
>>> +
>>> + if (pmd_large(*pmd)) {
>>> + set_pmd(pmd, __pmd(pmd_val(*pmd) & ~_PAGE_NX));
>>> + } else {
>>> + pte_t *pte = pte_offset_kernel(pmd, relocated_restore_code);
>>> +
>>> + set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_NX));
>>> + }
>>> + }
>>> + flush_tlb_all();
>>
>> I know you want to flush TLBs but this causes the splat below on the
>> resume kernel.
>>
>> Most likely because:
>>
>> resume_target_kernel() does local_irq_disable() and then
>>
>> swsusp_arch_resume() -> relocate_restore_code() -> flush_tlb_all()
>>
>> and smp_call_function_many() doesn't like it when IRQs are disabled.
>
> Right.
>
> Can I invoke __flush_tlb_all() from here to flush the TLB on the local
> CPU? That should be sufficient IMO.

Yes, unless there's another CPU already up at the time, but that seems unlikely.

FWIW, the CPU can and will cache the NX bit and the CPU will *not*
refresh the TLB before sending a page fault. You definitely need to
flush. I wonder if this is also why you're seeing a certain amount of
random corruption.

--Andy