Re: [PATCH v11 0/3] ACPI: APEI: handle synchronous exceptions in task work to send correct SIGBUS si_code

From: Shuai Xue
Date: Sun Feb 18 2024 - 20:47:23 EST


Hi, James and Borislav,

Gentle ping. Any feedback on this new version?

Thank you.

Best Regards,
Shuai

On 2024/2/4 16:01, Shuai Xue wrote:
> ## Changes Log
> changes since v10:
> - rebase to v6.8-rc2
>
> changes since v9:
> - split patch 2 to address exactly one issue in one patch (per Borislav)
> - rewrite commit log according to template (per Borislav)
> - pick up reviewed-by tag of patch 1 from James Morse
> - alloc and free twcb through gen_pool_{alloc,free} (per James)
> - rewrite cover letter
>
> changes since v8:
> - remove the bug fix tag of patch 2 (per Jarkko Sakkinen)
> - remove the declaration of memory_failure_queue_kick (per Naoya Horiguchi)
> - rewrite the return value comments of memory_failure (per Naoya Horiguchi)
>
> changes since v7:
> - rebase to Linux v6.6-rc2 (no code changed)
> - rewrite the cover letter to explain the motivation of this patchset
>
> changes since v6:
> - add more explicit error message suggested by Xiaofei
> - pick up reviewed-by tag from Xiaofei
> - pick up internal reviewed-by tag from Baolin
>
> changes since v5 by addressing comments from Kefeng:
> - document return value of memory_failure()
> - drop redundant comments in call site of memory_failure()
> - make ghes_do_proc void and handle abnormal case within it
> - pick up reviewed-by tag from Kefeng Wang
>
> changes since v4 by addressing comments from Xiaofei:
> - do a force kill only for abnormal sync errors
>
> changes since v3 by addressing comments from Xiaofei:
> - do a force kill for abnormal memory failure error such as invalid PA,
> unexpected severity, OOM, etc
> - pick up tested-by tag from Ma Wupeng
>
> changes since v2 by addressing comments from Naoya:
> - rename mce_task_work to sync_task_work
> - drop ACPI_HEST_NOTIFY_MCE case in is_hest_sync_notify()
> - add steps to reproduce this problem in cover letter
>
> changes since v1:
> - identify synchronous events by notify type
> - Link: https://lore.kernel.org/lkml/20221206153354.92394-3-xueshuai@xxxxxxxxxxxxxxxxx/
>
> ## Cover Letter
>
> There are two major types of uncorrected recoverable (UCR) errors:
>
> - Synchronous error: The error is detected and raised at the point of the
> consumption in the execution flow, e.g. when a CPU tries to access
> a poisoned cache line. The CPU will take a synchronous error exception
> such as Synchronous External Abort (SEA) on Arm64 and Machine Check
> Exception (MCE) on X86. The OS is required to take action (for example,
> offline the failing page or kill the failing thread) to recover from this
> uncorrectable error.
>
> - Asynchronous error: The error is detected out of processor execution
> context, e.g. when an error is detected by a background scrubber. Some data
> in memory is corrupted, but it has not been consumed yet. The OS may
> optionally take action to recover from this uncorrectable error.
>
> Since commit a70297d22132 ("ACPI: APEI: set memory failure flags as
> MF_ACTION_REQUIRED on synchronous events"), the flag MF_ACTION_REQUIRED
> can be used to determine whether a synchronous exception occurred on the
> ARM64 platform. When a synchronous exception is detected, the kernel should
> terminate the current process that accessed the poisoned page. This is
> done by sending a SIGBUS signal with si_code BUS_MCEERR_AR, indicating an
> action-required machine check error on read.
>
> However, memory failure recovery incorrectly sends a SIGBUS with the
> wrong si_code BUS_MCEERR_AO for synchronous errors in early kill mode,
> even though MF_ACTION_REQUIRED is set. The main problem is that, on the
> ARM64 platform, synchronous errors are queued as memory_failure() work
> and executed in a kernel thread context, not in the context of the
> user-space process that encountered the corrupted memory. As a result,
> when kill_proc() is called to terminate the process, it sends the
> incorrect SIGBUS si_code because the context in which it operates is not
> the one where the error was triggered.
>
> To this end, fix the problem by:
>
> - Patch 1: perform a force kill if no memory_failure() work is queued for
> synchronous errors.
> - Patch 2: a minor comment improvement.
> - Patch 3: queue memory_failure() as a task_work so that it runs in the
> context of the process that is actually consuming the poisoned
> data, and it will send SIGBUS with si_code BUS_MCEERR_AR.
>
> Lv Ying and XiuQi from Huawei also proposed to address a similar problem[2][4].
> Thanks to them for the discussion.
>
> ## Steps to Reproduce This Problem
>
> To reproduce this problem:
>
> # STEP1: enable early kill mode
> #sysctl -w vm.memory_failure_early_kill=1
> vm.memory_failure_early_kill = 1
>
> # STEP2: inject a UCE error and consume it to trigger a synchronous error
> #einj_mem_uc single
> 0: single vaddr = 0xffffb0d75400 paddr = 4092d55b400
> injecting ...
> triggering ...
> signal 7 code 5 addr 0xffffb0d75000
> page not present
> Test passed
>
> The si_code (code 5) from einj_mem_uc indicates a BUS_MCEERR_AO error,
> which is incorrect because the error was consumed synchronously.
>
> After this patch set:
>
> # STEP1: enable early kill mode
> #sysctl -w vm.memory_failure_early_kill=1
> vm.memory_failure_early_kill = 1
>
> # STEP2: inject a UCE error and consume it to trigger a synchronous error
> #einj_mem_uc single
> 0: single vaddr = 0xffffb0d75400 paddr = 4092d55b400
> injecting ...
> triggering ...
> signal 7 code 4 addr 0xffffb0d75000
> page not present
> Test passed
>
> The si_code (code 4) from einj_mem_uc indicates a BUS_MCEERR_AR error,
> as expected.
>
> [1] Add ARMv8 RAS virtualization support in QEMU https://patchew.org/QEMU/20200512030609.19593-1-gengdongjiu@xxxxxxxxxx/
> [2] https://lore.kernel.org/lkml/20221205115111.131568-3-lvying6@xxxxxxxxxx/
> [3] https://lkml.kernel.org/r/20220914064935.7851-1-xueshuai@xxxxxxxxxxxxxxxxx
> [4] https://lore.kernel.org/lkml/20221209095407.383211-1-lvying6@xxxxxxxxxx/
>
> Shuai Xue (3):
> ACPI: APEI: send SIGBUS to current task if synchronous memory error
> not recovered
> mm: memory-failure: move return value documentation to function
> declaration
> ACPI: APEI: handle synchronous exceptions in task work to send correct
> SIGBUS si_code
>
> arch/x86/kernel/cpu/mce/core.c | 9 +---
> drivers/acpi/apei/ghes.c | 84 +++++++++++++++++++++-------------
> include/acpi/ghes.h | 3 --
> mm/memory-failure.c | 22 +++------
> 4 files changed, 59 insertions(+), 59 deletions(-)
>