Re: [PATCH v2] arm64: Support Clang UBSAN trap codes for better reporting

From: Kees Cook
Date: Fri Feb 03 2023 - 14:25:02 EST


On Fri, Feb 03, 2023 at 11:14:49AM -0800, Fangrui Song wrote:
> On Fri, Feb 3, 2023 at 9:39 AM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> >
> > When building with CONFIG_UBSAN_TRAP=y on arm64, Clang encodes the UBSAN
> > check (handler) type in the esr. Extract this and actually report these
> > traps as coming from the specific UBSAN check that tripped.
> >
> > Before:
> >
> > Internal error: BRK handler: 00000000f20003e8 [#1] PREEMPT SMP
> >
> > After:
> >
> > Internal error: UBSAN: shift out of bounds: 00000000f2005514 [#1] PREEMPT SMP
> >
> > Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> > Cc: Will Deacon <will@xxxxxxxxxx>
> > Cc: Mark Rutland <mark.rutland@xxxxxxx>
> > Cc: John Stultz <jstultz@xxxxxxxxxx>
> > Cc: Yongqin Liu <yongqin.liu@xxxxxxxxxx>
> > Cc: Sami Tolvanen <samitolvanen@xxxxxxxxxx>
> > Cc: Ard Biesheuvel <ardb@xxxxxxxxxx>
> > Cc: Yury Norov <yury.norov@xxxxxxxxx>
> > Cc: Andrey Konovalov <andreyknvl@xxxxxxxxx>
> > Cc: Marco Elver <elver@xxxxxxxxxx>
> > Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> > Cc: llvm@xxxxxxxxxxxxxxx
> > Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx>
> > ---
> > v2: improve commit log, limit report strings to actual configs, document mappings
> > v1: https://lore.kernel.org/lkml/20230202223653.never.473-kees@xxxxxxxxxx/
>
> Thanks. I'll add the Linux kernel use to
> https://maskray.me/blog/2023-01-29-all-about-undefined-behavior-sanitizer
> when this lands:)

Oh nice post! Thanks for the pointer. :)

>
> > ---
> > arch/arm64/include/asm/brk-imm.h | 2 +
> > arch/arm64/kernel/traps.c | 21 ++++++++++
> > include/linux/ubsan.h | 9 +++++
> > lib/Makefile | 2 -
> > lib/ubsan.c | 67 ++++++++++++++++++++++++++++++++
> > lib/ubsan.h | 32 +++++++++++++++
> > 6 files changed, 131 insertions(+), 2 deletions(-)
> > create mode 100644 include/linux/ubsan.h
> >
> > diff --git a/arch/arm64/include/asm/brk-imm.h b/arch/arm64/include/asm/brk-imm.h
> > index 6e000113e508..3f0f0d03268b 100644
> > --- a/arch/arm64/include/asm/brk-imm.h
> > +++ b/arch/arm64/include/asm/brk-imm.h
> > @@ -28,6 +28,8 @@
> > #define BUG_BRK_IMM 0x800
> > #define KASAN_BRK_IMM 0x900
> > #define KASAN_BRK_MASK 0x0ff
> > +#define UBSAN_BRK_IMM 0x5500
> > +#define UBSAN_BRK_MASK 0x00ff
>
> Q: How is 0x5500 derived?

This is 'U' << 8 from:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AArch64/AArch64InstrInfo.td#L7571
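
To spell out the arithmetic, here is a minimal sketch (not the code from the patch). It assumes the BRK immediate occupies the low 16 bits of the ESR value. BRK_COMMENT_MASK and the three helper functions are hypothetical names used only for illustration; the two UBSAN_BRK_* values are the ones from the hunk above.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Values quoted from the brk-imm.h hunk. */
#define UBSAN_BRK_IMM   0x5500
#define UBSAN_BRK_MASK  0x00ff

/* Assumption: the BRK immediate sits in the low 16 bits of the ESR. */
#define BRK_COMMENT_MASK 0xffffu   /* hypothetical name */

/* 'U' is 0x55 in ASCII, so shifting it into the high byte gives 0x5500. */
static_assert(('U' << 8) == UBSAN_BRK_IMM, "UBSAN_BRK_IMM should be 'U' << 8");

/* Return the 16-bit BRK immediate carried in an ESR value. */
static inline uint32_t brk_comment(uint64_t esr)
{
	return (uint32_t)(esr & BRK_COMMENT_MASK);
}

/* Return nonzero if the immediate falls in the range 0x5500..0x55ff. */
static inline int is_ubsan_brk(uint64_t esr)
{
	return (brk_comment(esr) & ~(uint32_t)UBSAN_BRK_MASK) == UBSAN_BRK_IMM;
}

/* Return the check type encoded in the low byte of the immediate. */
static inline uint32_t ubsan_check_code(uint64_t esr)
{
	return brk_comment(esr) & UBSAN_BRK_MASK;
}
```

Applied to the "After" value from the commit log, 0xf2005514, the immediate is 0x5514 and the check code is 0x14, which is what gets reported as "shift out of bounds". The "Before" value, 0xf20003e8, carries the immediate 0x3e8 and falls outside the UBSAN range.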

> [...]
> > +#ifdef CONFIG_UBSAN_TRAP
> > + register_kernel_break_hook(&ubsan_break_hook);
> > #endif
>
> IIUC, the break hook is a list so CONFIG_KASAN_SW_TAGS
> (kernel-hwaddress) can be used with CONFIG_UBSAN_TRAP.

Should I be doing something different here?
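
For reference, a rough sketch of why the two hooks would not collide. The struct and function names are hypothetical stand-ins, and the matching rule (clear the mask bits, then compare against the immediate) is my assumption about how hooks are selected, not a copy of the arm64 code. The immediates and masks are the ones from the brk-imm.h hunk.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a break hook: an immediate, a mask, and a name. */
struct demo_break_hook {
	uint16_t imm;
	uint16_t mask;
	const char *name;
};

/* Immediates and masks as they appear in the brk-imm.h hunk. */
static const struct demo_break_hook demo_hooks[] = {
	{ 0x0900, 0x00ff, "KASAN" },
	{ 0x5500, 0x00ff, "UBSAN" },
};

/*
 * Walk the hooks and return the name of the first one whose range contains
 * the immediate, or NULL if none does.
 * Assumed matching rule: (comment & ~mask) == imm.
 */
static const char *demo_find_hook(uint16_t comment)
{
	size_t i;

	for (i = 0; i < sizeof(demo_hooks) / sizeof(demo_hooks[0]); i++) {
		const struct demo_break_hook *h = &demo_hooks[i];

		if ((comment & ~h->mask) == h->imm)
			return h->name;
	}
	return NULL;
}
```

Under that rule, 0x5514 selects only the UBSAN entry, 0x0905 selects only the KASAN entry, and 0x03e8 selects neither, since the ranges 0x0900..0x09ff and 0x5500..0x55ff are disjoint.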

> [...]
> > +#ifdef CONFIG_UBSAN_ALIGNMENT
> > + /*
> > + * SanitizerKind::Alignment emits SanitizerHandler::TypeMismatch
> > + * or SanitizerHandler::AlignmentAssumption.
> > + */
> > + case ubsan_alignment_assumption:
> > + return "UBSAN: alignment assumption";
> > + case ubsan_type_mismatch:
> > + return "UBSAN: type mismatch";
> > +#endif
> > + default:
> > + return "UBSAN: unrecognized failure code";
> > + }
> > +}
>
> I wonder whether keeping the dash-prefixed name is better since that
> matches compiler-rt/lib/ubsan.
> People can search for "add-overflow" and get cross references from
> compiler-rt/lib/ubsan, instead of needing to know that "addition
> overflow" is another name for "add-overflow".

I think that the consumer of these messages wants as much plain-language
detail as possible, so I'd prefer to expand these into full phrasing. To
make it all more discoverable, I included all the details about how the
mapping worked in the comments.

> [...]
> Reviewed-by: Fangrui Song <maskray@xxxxxxxxxx>

Thanks!

--
Kees Cook