Re: [PATCH v1 1/3] mm/memory-failure: Use a mutex to avoid memory_failure() races

From: Borislav Petkov
Date: Tue Apr 20 2021 - 06:17:03 EST


On Tue, Apr 20, 2021 at 07:46:26AM +0000, HORIGUCHI NAOYA(堀口 直也) wrote:
> If you have any other suggestion, please let me know.

Looks almost ok...

> From: Tony Luck <tony.luck@xxxxxxxxx>
> Date: Tue, 20 Apr 2021 16:42:01 +0900
> Subject: [PATCH 1/3] mm/memory-failure: Use a mutex to avoid memory_failure()
> races
>
> There can be races when multiple CPUs consume poison from the same
> page. The first into memory_failure() atomically sets the HWPoison
> page flag and begins hunting for tasks that map this page. Eventually
> it invalidates those mappings and may send a SIGBUS to the affected
> tasks.
>
> But while all that work is going on, other CPUs see a "success"
> return code from memory_failure() and so they believe the error
> has been handled and continue executing.
>
> Fix by wrapping most of the internal parts of memory_failure() in
> a mutex.
>
> Along with introducing an additional goto label, this patch also

... avoid having "This patch" or "This commit" in the commit message.
It is tautologically useless. Also, you don't have to explain what the
patch does - that's visible hopefully. :-)

Other than that, it makes sense and the "sandwitching" looks correct:

mutex_lock
lock_page

...

unlock_page
mutex_unlock

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette