Re: [PATCH 4.4 00/37] 4.4.110-stable review

From: Pavel Tatashin
Date: Thu Jan 11 2018 - 15:20:06 EST




On 01/11/2018 03:10 PM, Greg Kroah-Hartman wrote:
> On Thu, Jan 11, 2018 at 01:36:50PM -0500, Pavel Tatashin wrote:
> > I have root-caused the memory corruption panics/hangs that I've been
> > experiencing during boot with the latest 4.4.110 kernel. The problem,
> > as Andy Lutomirski suspected, is in the interaction between PTI
> > and EFI. It may affect any system that has an EFI BIOS. I have not
> > verified whether it can affect any kernel besides 4.4.110.
> >
> > Attached is the fix for this issue with explanations that Steve
> > Sistare and I developed.
>
> Nice, but why does this not show up in 4.9 and 4.14 and Linus's tree as
> well on this hardware? Nor on the SLES12 SP3 kernel?
>
> What is different there that 4.4 requires? That worries me more than
> your fix (which looks good to me, fwiw.)

Hi Greg,

I have not studied other kernel versions; EFI has changed substantially since 4.4. But even on 4.4.110, several things have to happen for this bug to show up:

1. During boot, memblock must allocate an address that is not 2 * PAGE_SIZE aligned.
2. An NMI must arrive exactly while EFI has the page table replaced.
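
To make condition 1 concrete, here is a minimal sketch of the alignment test it refers to. This is only an illustration, not code from the attached fix: the names SKETCH_PAGE_SIZE, SKETCH_PGD_ALIGN and is_two_page_aligned() are hypothetical, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the name is hypothetical, not the kernel's. */
#define SKETCH_PAGE_SIZE 4096ULL
/* The alignment from condition 1: two pages, i.e. 8 KiB here. */
#define SKETCH_PGD_ALIGN (2 * SKETCH_PAGE_SIZE)

/* The mask trick below only works for a power-of-two alignment. */
static_assert((SKETCH_PGD_ALIGN & (SKETCH_PGD_ALIGN - 1)) == 0,
              "alignment must be a power of two");

/* Returns true if addr is a multiple of 2 * PAGE_SIZE. */
static bool is_two_page_aligned(uint64_t addr)
{
    return (addr & (SKETCH_PGD_ALIGN - 1)) == 0;
}
```

With these assumptions, an address such as 0x2000 passes the check, while 0x1000 or 0x3000 (page aligned, but not two-page aligned) is the kind of address condition 1 describes.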

While debugging this problem, I tried enabling kasan and vm_debug, adding more printfs, etc., but every little change caused the problem to disappear or to appear less frequently.

Thank you,
Pavel