Re: [PATCH v4 3/3] scripts/faddr2line: Skip over mapping symbols in output from readelf

From: Nick Desaulniers
Date: Mon Sep 18 2023 - 11:53:04 EST


On Thu, Sep 14, 2023 at 6:12 AM Will Deacon <will@xxxxxxxxxx> wrote:
>
> Mapping symbols emitted in the readelf output can confuse the
> 'faddr2line' symbol size calculation, resulting in the erroneous
> rejection of valid offsets. This is especially prevalent when building
> an arm64 kernel with CONFIG_CFI_CLANG=y, where most functions are
> prefixed with a 32-bit data value in a '$d.n' section. For example:
>
> 447538: ffff800080014b80 548 FUNC GLOBAL DEFAULT 2 do_one_initcall
> 104: ffff800080014c74 0 NOTYPE LOCAL DEFAULT 2 $x.73
> 106: ffff800080014d30 0 NOTYPE LOCAL DEFAULT 2 $x.75
> 111: ffff800080014da4 0 NOTYPE LOCAL DEFAULT 2 $d.78
> 112: ffff800080014da8 0 NOTYPE LOCAL DEFAULT 2 $x.79
> 36: ffff800080014de0 200 FUNC LOCAL DEFAULT 2 run_init_process
>
> Adding a warning to do_one_initcall() results in:
>
> | WARNING: CPU: 0 PID: 1 at init/main.c:1236 do_one_initcall+0xf4/0x260
>
> Which 'faddr2line' refuses to accept:
>
> $ ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
> skipping do_one_initcall address at 0xffff800080014c74 due to size mismatch (0x260 != 0x224)
> no match for do_one_initcall+0xf4/0x260
>
> Filter out these entries from readelf using a shell reimplementation of
> is_mapping_symbol(), so that the size of a symbol is calculated as a
> delta to the next symbol present in ksymtab.
>
> Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
> Cc: John Stultz <jstultz@xxxxxxxxxx>
> Suggested-by: Masahiro Yamada <masahiroy@xxxxxxxxxx>
> Signed-off-by: Will Deacon <will@xxxxxxxxxx>
> ---
> scripts/faddr2line | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/scripts/faddr2line b/scripts/faddr2line
> index 6b8206802157..20d9b3d37843 100755
> --- a/scripts/faddr2line
> +++ b/scripts/faddr2line
> @@ -179,6 +179,11 @@ __faddr2line() {
> local cur_sym_elf_size=${fields[2]}
> local cur_sym_name=${fields[7]:-}
>
> + # is_mapping_symbol(cur_sym_name)
> + if [[ ${cur_sym_name} =~ ^((\.L)|(L0)|(\$[adtx](\.|$))) ]]; then

Thanks for the patch!

I'm curious about the `|$` in the final part of the regex. IIUC that
also matches a bare mapping symbol with nothing after the letter, e.g.
$a
Do we have any such symbols without a `.<n>` suffix?

With aarch64 defconfig + cfi:
$ llvm-readelf -s vmlinux | grep '\$' | rev | cut -d ' ' -f 1 | rev | sort -u
I only see $d.<n> and $x.<n>, where <n> starts at zero (i.e. the first
ones are $d.0 and $x.0, as opposed to having no `.<n>` suffix).
Can we tighten up that last part of the regex to be `\$[adtx]\.[0-9]+$`?
Or perhaps you've observed mapping symbols use another convention than
what clang is doing?
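
To make the difference concrete, here is a rough bash sketch that runs
both patterns over a few sample names. It is only an illustration:
`patch_re` is copied from the hunk above, while `strict_re` and the
`classify` helper are hypothetical names made up for this example, and
`strict_re` is the tightening proposed here rather than anything in the
patch.

```shell
#!/usr/bin/env bash
# Regex from the patch: matches "$a" on its own as well as "$a.<anything>".
patch_re='^((\.L)|(L0)|(\$[adtx](\.|$)))'
# Hypothetical tighter variant: the mapping symbol must end in ".<digits>".
strict_re='^((\.L)|(L0)|(\$[adtx]\.[0-9]+$))'

# Print whether each regex would treat the given name as a mapping symbol.
classify() {
    local name=$1 p=no s=no
    [[ $name =~ $patch_re ]] && p=yes
    [[ $name =~ $strict_re ]] && s=yes
    printf '%s patch=%s strict=%s\n' "$name" "$p" "$s"
}

for sym in '$x.73' '$d.78' '$a' '$d' '$dx' 'do_one_initcall'; do
    classify "$sym"
done
```

The two patterns agree on `$x.73`, `$d.78`, `$dx` and `do_one_initcall`;
they differ only on the bare `$a` and `$d`, which the patch's regex skips
and the stricter one does not. So the choice only matters if some
toolchain emits mapping symbols without a suffix.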

https://sourceware.org/binutils/docs/as/AArch64-Mapping-Symbols.html
also only mentions $d and $x. Ah,
https://developer.arm.com/documentation/dui0803/a/Accessing-and-managing-symbols-with-armlink/About-mapping-symbols
mentions $a for A32 and $t for T32.
Consider adding a link to the ARM documentation on mapping symbols in
the commit message?

(Curiously, `llvm-nm` does not print these symbols, but `llvm-readelf -s` does.)

> + continue
> + fi
> +
> if [[ $cur_sym_addr = $sym_addr ]] &&
> [[ $cur_sym_elf_size = $sym_elf_size ]] &&
> [[ $cur_sym_name = $sym_name ]]; then
> --
> 2.42.0.283.g2d96d420d3-goog
>


--
Thanks,
~Nick Desaulniers