Re: [PATCH v2] mm: hwpoison: coredump: support recovery from dump_user_range()

From: Kefeng Wang
Date: Mon Apr 24 2023 - 21:48:01 EST




On 2023/4/25 0:17, Luck, Tony wrote:
This change does not seem to be related to what you are trying to fix.
Could it break other workloads, such as copying from a user address?


Yes, this moves the setting of MCE_IN_KERNEL_COPYIN into the next case, so
both the COPY and MCE_SAFE types will set MCE_IN_KERNEL_COPYIN. For
EX_TYPE_COPY nothing changes, so we don't break it.
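
To make the case move described above concrete, here is a minimal user-space
sketch of the fall-through logic. It is only a model under stated
assumptions: the enum values, flag values, struct mce_model, and the
classify_before()/classify_after() helpers are hypothetical stand-ins, not
the kernel's actual definitions or the exact patch hunk.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's fixup types (values illustrative). */
enum fixup_type {
	EX_TYPE_NONE,
	EX_TYPE_UACCESS,
	EX_TYPE_COPY,
	EX_TYPE_FAULT_MCE_SAFE,
	EX_TYPE_DEFAULT_MCE_SAFE,
};

enum context { IN_KERNEL, IN_KERNEL_RECOV };

/* Illustrative flag bits; the real kernel values differ. */
#define MCE_IN_KERNEL_RECOV  0x1u
#define MCE_IN_KERNEL_COPYIN 0x2u

struct mce_model { unsigned int kflags; };

/* Before: only the UACCESS/COPY case sets MCE_IN_KERNEL_COPYIN. */
static enum context classify_before(struct mce_model *m, enum fixup_type t,
				    bool copy_user)
{
	switch (t) {
	case EX_TYPE_UACCESS:
	case EX_TYPE_COPY:
		if (!copy_user)
			return IN_KERNEL;
		m->kflags |= MCE_IN_KERNEL_COPYIN;
		/* fall through */
	case EX_TYPE_FAULT_MCE_SAFE:
	case EX_TYPE_DEFAULT_MCE_SAFE:
		m->kflags |= MCE_IN_KERNEL_RECOV;
		return IN_KERNEL_RECOV;
	default:
		return IN_KERNEL;
	}
}

/* After: the flag is set in the shared case, so MCE_SAFE types get it too. */
static enum context classify_after(struct mce_model *m, enum fixup_type t,
				   bool copy_user)
{
	switch (t) {
	case EX_TYPE_UACCESS:
	case EX_TYPE_COPY:
		if (!copy_user)
			return IN_KERNEL;
		/* fall through */
	case EX_TYPE_FAULT_MCE_SAFE:
	case EX_TYPE_DEFAULT_MCE_SAFE:
		m->kflags |= MCE_IN_KERNEL_RECOV | MCE_IN_KERNEL_COPYIN;
		return IN_KERNEL_RECOV;
	default:
		return IN_KERNEL;
	}
}
```

In this model, EX_TYPE_COPY with copy_user true ends up with the same flags
in both versions, while the MCE_SAFE types gain MCE_IN_KERNEL_COPYIN only in
the second one.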

Should Linux even try to take a core dump for a SIGBUS generated because
the application accessed a poisoned page?

It doesn't seem like it would be useful. Core dumps are for debugging s/w
program errors in applications and libraries. That isn't the case when there
is a poison consumption. The application did nothing wrong.

This patch is still useful though. There may be an undiscovered poison
page in the application. Avoiding a kernel crash when dumping core
is still a good thing.

Thanks for confirming. What is your opinion on adding
MCE_IN_KERNEL_COPYIN to the EX_TYPE_DEFAULT_MCE_SAFE/FAULT_MCE_SAFE types
so that do_machine_check() calls queue_task_work(&m, msg, kill_me_never)?
That would eliminate every memory_failure_queue() call made after an
mc-safe copy returns.


-Tony