Re: [PATCH] arm64: panic on synchronous external abort in kernel context

From: Mark Rutland
Date: Tue Apr 14 2020 - 06:59:43 EST


On Fri, Apr 10, 2020 at 09:52:45AM +0800, Xie XiuQi wrote:
> We should panic even if panic_on_oops is not set, when we can't recover
> from a synchronous external abort in kernel context.
>
> Otherwise, there are two issues:
> 1) we fall back to do_exit() in exception context, causing this core to hang.
> do_sea()
> -> arm64_notify_die
> -> die
> -> do_exit
> 2) errors may be propagated.
>
> Signed-off-by: Xie XiuQi <xiexiuqi@xxxxxxxxxx>
> Cc: Xiaofei Tan <tanxiaofei@xxxxxxxxxx>
> ---
> arch/arm64/include/asm/esr.h | 12 ++++++++++++
> arch/arm64/kernel/traps.c | 2 ++
> 2 files changed, 14 insertions(+)
>
> diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
> index cb29253ae86b..acfc71c6d148 100644
> --- a/arch/arm64/include/asm/esr.h
> +++ b/arch/arm64/include/asm/esr.h
> @@ -326,6 +326,18 @@ static inline bool esr_is_data_abort(u32 esr)
> return ec == ESR_ELx_EC_DABT_LOW || ec == ESR_ELx_EC_DABT_CUR;
> }
>
> +static inline bool esr_is_inst_abort(u32 esr)
> +{
> + const u32 ec = ESR_ELx_EC(esr);
> +
> + return ec == ESR_ELx_EC_IABT_LOW || ec == ESR_ELx_EC_IABT_CUR;
> +}
> +
> +static inline bool esr_is_ext_abort(u32 esr)
> +{
> + return esr_is_data_abort(esr) || esr_is_inst_abort(esr);
> +}

A data abort or an instruction abort is not necessarily a synchronous
external abort, so this isn't right.
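
To illustrate the distinction, here is a minimal userspace sketch (not a
proposed fix) of a check that looks at the fault status code as well as
the exception class. The field values mirror those in
arch/arm64/include/asm/esr.h; the helper names esr_is_abort() and
esr_is_sync_ext_abort() are hypothetical, and uint32_t stands in for the
kernel's u32. It only matches FSC 0x10 (synchronous external abort, not
on a translation table walk); external aborts taken during a table walk
use other status codes and are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Exception class (EC) field, ESR bits [31:26]. */
#define ESR_ELx_EC_SHIFT	26
#define ESR_ELx_EC_MASK		(UINT32_C(0x3F) << ESR_ELx_EC_SHIFT)
#define ESR_ELx_EC(esr)		(((esr) & ESR_ELx_EC_MASK) >> ESR_ELx_EC_SHIFT)

#define ESR_ELx_EC_IABT_LOW	0x20	/* instruction abort, lower EL */
#define ESR_ELx_EC_IABT_CUR	0x21	/* instruction abort, current EL */
#define ESR_ELx_EC_DABT_LOW	0x24	/* data abort, lower EL */
#define ESR_ELx_EC_DABT_CUR	0x25	/* data abort, current EL */

/* Fault status code (FSC) field, ESR bits [5:0]. */
#define ESR_ELx_FSC		0x3F
#define ESR_ELx_FSC_EXTABT	0x10	/* synchronous external abort */

/* Hypothetical helper: true for any data or instruction abort. */
static inline bool esr_is_abort(uint32_t esr)
{
	const uint32_t ec = ESR_ELx_EC(esr);

	return ec == ESR_ELx_EC_DABT_LOW || ec == ESR_ELx_EC_DABT_CUR ||
	       ec == ESR_ELx_EC_IABT_LOW || ec == ESR_ELx_EC_IABT_CUR;
}

/*
 * Hypothetical helper: the EC alone only says "this was an abort";
 * the FSC says which kind, so both must be checked.
 */
static inline bool esr_is_sync_ext_abort(uint32_t esr)
{
	return esr_is_abort(esr) &&
	       (esr & ESR_ELx_FSC) == ESR_ELx_FSC_EXTABT;
}
```

With this, a data abort at the current EL carrying a translation fault
(FSC 0x04) is an abort but not an external abort, which is the case an
EC-only check gets wrong.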

What exactly are you trying to catch here? If you are seeing a problem
in practice, can you please share your log from a crash?

Thanks,
Mark.

> +
> const char *esr_get_class_string(u32 esr);
> #endif /* __ASSEMBLY */
>
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index cf402be5c573..08f7f7688d5b 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -202,6 +202,8 @@ void die(const char *str, struct pt_regs *regs, int err)
> panic("Fatal exception in interrupt");
> if (panic_on_oops)
> panic("Fatal exception");
> + if (esr_is_ext_abort(err))
> + panic("Synchronous external abort in kernel context");
>
> raw_spin_unlock_irqrestore(&die_lock, flags);
>
> --
> 2.20.1
>