Re: [PATCH v3 1/3] mm/memory-failure: Use a mutex to avoid memory_failure() races

From: Borislav Petkov
Date: Thu Apr 22 2021 - 13:01:28 EST


On Wed, Apr 21, 2021 at 09:57:26AM +0900, Naoya Horiguchi wrote:
> From: Tony Luck <tony.luck@xxxxxxxxx>
>
> There can be races when multiple CPUs consume poison from the same
> page. The first into memory_failure() atomically sets the HWPoison
> page flag and begins hunting for tasks that map this page. Eventually
> it invalidates those mappings and may send a SIGBUS to the affected
> tasks.
>
> But while all that work is going on, other CPUs see a "success"
> return code from memory_failure() and so they believe the error
> has been handled and continue executing.
>
> Fix by wrapping most of the internal parts of memory_failure() in
> a mutex.
>
> Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
> ---
> mm/memory-failure.c | 37 ++++++++++++++++++++++++-------------
> 1 file changed, 24 insertions(+), 13 deletions(-)

Reviewed-by: Borislav Petkov <bp@xxxxxxx>

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette