Re: [PATCH] kallsyms: Fix kallsyms_selftest failure

From: Song Liu
Date: Fri Aug 25 2023 - 02:54:08 EST




> On Aug 24, 2023, at 8:46 PM, Yonghong Song <yonghong.song@xxxxxxxxx> wrote:
>
> Kernel test robot reported a kallsyms_test failure when clang lto is
> enabled (thin or full) and CONFIG_KALLSYMS_SELFTEST is also enabled.
> I can reproduce in my local environment with the following error message
> with thin lto:
> [ 1.877897] kallsyms_selftest: Test for 1750th symbol failed: (tsc_cs_mark_unstable) addr=ffffffff81038090
> [ 1.877901] kallsyms_selftest: abort
>
> It appears that commit 8cc32a9bbf29 ("kallsyms: strip LTO-only suffixes
> from promoted global functions") caused the failure. Commit 8cc32a9bbf29
> changed cleanup_symbol_name() based on ".llvm." instead of '.' where
> ".llvm." is appended to a before-lto-optimization local symbol name.
> We need to propagate such knowledge in kallsyms_selftest.c as well.
>
> Furthermore, compare_symbol_name() in kallsyms.c needs to change as well.
> In scripts/kallsyms.c, kallsyms_names and kallsyms_seqs_of_names are used
> to record the symbol names themselves and the indexes into the symbol names, respectively.
> For example:
> kallsyms_names:
> ...
> __amd_smn_rw._entry <== seq 1000
> __amd_smn_rw._entry.5 <== seq 1001
> __amd_smn_rw.llvm.<hash> <== seq 1002
> ...
>
> kallsyms_seqs_of_names is sorted based on cleanup_symbol_name() though, so
> the order in kallsyms_seqs_of_names is actually
>
> index 1000: seq 1002 <== __amd_smn_rw.llvm.<hash> (actual symbol comparison using '__amd_smn_rw')
> index 1001: seq 1000 <== __amd_smn_rw._entry
> index 1002: seq 1001 <== __amd_smn_rw._entry.5
>
> Say that at some point in the search, at index 1000, the symbol '__amd_smn_rw.llvm.<hash>'
> is compared with '__amd_smn_rw._entry', where '__amd_smn_rw._entry' is the one being
> searched for, e.g. by kallsyms_on_each_match_symbol(). The current implementation
> finds that '__amd_smn_rw._entry' is less than '__amd_smn_rw.llvm.<hash>', so it
> continues the search at lower indexes (e.g. index 999) and never finds a match, although
> index 1001 is a match.
>
> To fix this issue, do cleanup_symbol_name() first and then do the comparison.
> In the above case, '__amd_smn_rw' is compared with '__amd_smn_rw._entry'; since
> '__amd_smn_rw._entry' is greater than '__amd_smn_rw', the next comparison will
> be at an index > 1000, and eventually index 1001 will be hit and a match is found.
>
> For any symbol without a '.llvm.' substring, there is no functional change
> in compare_symbol_name().
>
> Fixes: 8cc32a9bbf29 ("kallsyms: strip LTO-only suffixes from promoted global functions")
> Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> Closes: https://lore.kernel.org/oe-lkp/202308232200.1c932a90-oliver.sang@xxxxxxxxx
> Signed-off-by: Yonghong Song <yonghong.song@xxxxxxxxx>
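
For anyone following along, the ordering problem described above can be reproduced with a small userspace sketch. This is not the kernel code: strip_llvm_suffix(), compare_raw(), compare_cleaned(), lookup() and the table are hypothetical stand-ins, the "1234" hash and the two "__aaa_hypothetical_*" entries are made up, and compare_raw() is only a simplified model of the pre-patch comparison. The table is ordered by the cleaned-up names, as the commit message says kallsyms_seqs_of_names is.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN 128

/* Copy name into out, dropping a ".llvm.<hash>" suffix if one is present. */
static void strip_llvm_suffix(const char *name, char *out, size_t out_len)
{
	const char *suffix = strstr(name, ".llvm.");
	size_t len = suffix ? (size_t)(suffix - name) : strlen(name);

	assert(len < out_len);
	memcpy(out, name, len);
	out[len] = '\0';
}

/* Simplified model of the old behaviour: compare against the raw entry. */
static int compare_raw(const char *key, const char *entry)
{
	return strcmp(key, entry);
}

/* Behaviour after the fix: clean up the entry first, then compare. */
static int compare_cleaned(const char *key, const char *entry)
{
	char cleaned[NAME_LEN];

	strip_llvm_suffix(entry, cleaned, sizeof(cleaned));
	return strcmp(key, cleaned);
}

typedef int (*cmp_fn)(const char *key, const char *entry);

/* Plain binary search; returns the matching index or -1. */
static int lookup(const char *const *table, int n, const char *key, cmp_fn cmp)
{
	int low = 0, high = n - 1;

	while (low <= high) {
		int mid = low + (high - low) / 2;
		int ret = cmp(key, table[mid]);

		if (ret > 0)
			low = mid + 1;
		else if (ret < 0)
			high = mid - 1;
		else
			return mid;
	}
	return -1;
}

/* Sorted by cleaned-up name, so the ".llvm." entry sorts as "__amd_smn_rw". */
static const char *const example_table[] = {
	"__aaa_hypothetical_1",		/* made-up filler entry */
	"__aaa_hypothetical_2",		/* made-up filler entry */
	"__amd_smn_rw.llvm.1234",	/* made-up hash */
	"__amd_smn_rw._entry",
	"__amd_smn_rw._entry.5",
};
```

With this table the first probe lands on the ".llvm." entry. lookup(example_table, 5, "__amd_smn_rw._entry", compare_raw) goes left, because '_' sorts before 'l', and returns -1, while the same call with compare_cleaned goes right and returns index 3.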

Reviewed-by: Song Liu <song@xxxxxxxxxx>