Re: [PATCH v2] libbpf: Improve version handling when attaching uprobe

From: Yonghong Song
Date: Mon May 01 2023 - 11:24:08 EST




On 5/1/23 6:00 AM, Espen Grindhaug wrote:
On Thu, Apr 27, 2023 at 06:19:29PM -0700, Yonghong Song wrote:


On 4/27/23 12:19 PM, Espen Grindhaug wrote:
On Wed, Apr 26, 2023 at 02:47:27PM -0700, Yonghong Song wrote:


On 4/23/23 11:55 AM, Espen Grindhaug wrote:
This change fixes the handling of versions in elf_find_func_offset.
In the previous implementation, we incorrectly assumed that the

Could you give more explanation/example in the commit message
what does 'incorrectly' mean here? In which situations the
current libbpf implementation will not be correct?


How about something like this?


libbpf: Improve version handling when attaching uprobe

This change fixes the handling of versions in elf_find_func_offset.

For example, let's assume we are trying to attach a uprobe to pthread_create in
glibc. Prior to this commit, it would fail with an error message saying 'elf:
ambiguous match [...]', because there are two entries in the symbol table with
that name.

$ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep pthread_create
0000000000094cc0 T pthread_create@GLIBC_2.2.5
0000000000094cc0 T pthread_create@@GLIBC_2.34

So we go ahead and modify our code to attach to 'pthread_create@@GLIBC_2.34',
and this also fails, but this time with the error 'elf: failed to find symbol
[...]'. This fails because we incorrectly assumed that the version information
would be present in the string found in the string table, but there is only the
string 'pthread_create'.

I tried one example with my centos8 libpthread library.

$ llvm-readelf -s /lib64/libc-2.28.so | grep pthread_cond_signal
39: 0000000000095f70 43 FUNC GLOBAL DEFAULT 14 pthread_cond_signal@@GLIBC_2.3.2
40: 0000000000096250 43 FUNC GLOBAL DEFAULT 14 pthread_cond_signal@GLIBC_2.2.5
3160: 0000000000096250 43 FUNC LOCAL DEFAULT 14 __pthread_cond_signal_2_0
3589: 0000000000095f70 43 FUNC LOCAL DEFAULT 14 __pthread_cond_signal
5522: 0000000000095f70 43 FUNC GLOBAL DEFAULT 14 pthread_cond_signal@@GLIBC_2.3.2
5545: 0000000000096250 43 FUNC GLOBAL DEFAULT 14 pthread_cond_signal@GLIBC_2.2.5
$ nm -D /lib64/libc-2.28.so | grep pthread_cond_signal
0000000000095f70 T pthread_cond_signal@@GLIBC_2.3.2
0000000000096250 T pthread_cond_signal@GLIBC_2.2.5
$

Note that the two pthread_cond_signal functions have different addresses,
which is expected as they are implemented for different versions.

But in your case,
$ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep pthread_create
0000000000094cc0 T pthread_create@GLIBC_2.2.5
0000000000094cc0 T pthread_create@@GLIBC_2.34

The two functions have the same address, which is very weird. I suspect
there is an issue here that at least needs some investigation.


I am no expert on this, but as far as I can tell, this is normal,
although much more common on my Ubuntu machine than my Fedora machine.

Script to find duplicates:

nm -D /usr/lib64/libc-2.33.so | awk '
{
    addr = $1;
    symbol = $3;
    sub(/[@].*$/, "", symbol);

    if (addr == prev_addr && symbol == prev_symbol) {
        if (prev_symbol_printed == 0) {
            print prev_line;
            prev_symbol_printed = 1;
        }
        print;
    } else {
        prev_symbol_printed = 0;
    }
    prev_addr = addr;
    prev_symbol = symbol;
    prev_line = $0;
}'


Second, the ELF encoding of a symbol table entry is as follows:

typedef struct {
    Elf64_Word    st_name;
    unsigned char st_info;
    unsigned char st_other;
    Elf64_Half    st_shndx;
    Elf64_Addr    st_value;
    Elf64_Xword   st_size;
} Elf64_Sym;

where
st_name

An index into the object file's symbol string table, which holds the
character representations of the symbol names. If the value is nonzero, the
value represents a string table index that gives the symbol name. Otherwise,
the symbol table entry has no name.

So, the function name (including @..., @@...) should be in the string table,
which is the same for the above two pthread_cond_signal symbols.

I think it is worthwhile to debug why in your situation
pthread_create@GLIBC_2.2.5 and pthread_create@@GLIBC_2.34 do not
have them in the string table.


I think you are mistaken here; the strings in the strings table don't contain
the version. Take a look at this partial dump of the strings table.

$ readelf -W -p .dynstr /usr/lib64/libc-2.33.so

String dump of section '.dynstr':
[ 1] xdrmem_create
[ f] __wctomb_chk
[ 1c] getmntent
[ 26] __freelocale
[ 33] __rawmemchr
[ 3f] _IO_vsprintf
[ 4c] getutent
[ 55] __file_change_detection_for_path
(...)
[ 350e] memrchr
[ 3516] pthread_cond_signal
[ 352a] __close
(...)
[ 61b6] GLIBC_2.2.5
[ 61c2] GLIBC_2.2.6
[ 61ce] GLIBC_2.3
[ 61d8] GLIBC_2.3.2
[ 61e4] GLIBC_2.3.3

As you can see, the strings have no versions, and the version strings
themselves are also in this table as entries at the end of the table.
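
To make the layout concrete, here is a minimal C sketch of how an st_name
offset resolves against a string table. The helper name strtab_lookup and the
tiny hand-built table in the usage note are hypothetical and only for
illustration; the offsets are not taken from a real libc.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libbpf): return the NUL-terminated
 * string stored at byte offset 'off' in a string table of 'size' bytes,
 * or NULL if the offset is 0 (no name) or out of bounds.
 */
static const char *strtab_lookup(const char *strtab, size_t size, size_t off)
{
    assert(strtab != NULL);

    /* st_name == 0 means the symbol table entry has no name. */
    if (off == 0 || off >= size)
        return NULL;

    /* Refuse strings that are not terminated inside the table. */
    if (memchr(strtab + off, '\0', size - off) == NULL)
        return NULL;

    return strtab + off;
}
```

With a table laid out as "\0pthread_create\0GLIBC_2.34", offset 1 yields the
bare name 'pthread_create' and offset 16 yields 'GLIBC_2.34'. Two versioned
symbols with the same base name can share one st_name offset, while the
version string lives at its own offset and is referenced from the version
sections rather than from st_name.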

I see you search the .dynstr section. Do you think we should
search .strtab instead, since it contains versioned symbols?



This patch reworks how we compare the symbol name provided by the user if it is
qualified with a version (using @ or @@). We now look up the correct version
string in the version symbol table before constructing the full name, as also
done above by nm, before comparing.
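
As a rough illustration of the comparison described above, here is a minimal C
sketch. It is not the code from this patch: sym_name_matches is a hypothetical
helper, and it assumes the caller has already resolved the symbol's version
string and its default/non-default status from the version tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not libbpf's implementation).
 *   want       - name requested by the user, optionally with @ver or @@ver
 *   base       - bare symbol name as stored in the string table
 *   ver        - version string for this symbol, or NULL if unversioned
 *   is_default - true if this is the default version (nm prints "@@")
 */
static bool sym_name_matches(const char *want, const char *base,
                             const char *ver, bool is_default)
{
    char full[256];
    int n;

    assert(want != NULL && base != NULL);

    /* Unqualified request: compare against the bare name only. */
    if (strchr(want, '@') == NULL)
        return strcmp(want, base) == 0;

    /* A version-qualified request cannot match an unversioned symbol. */
    if (ver == NULL)
        return false;

    /* Rebuild the name the way nm prints it, then compare. */
    n = snprintf(full, sizeof(full), "%s%s%s",
                 base, is_default ? "@@" : "@", ver);
    if (n < 0 || (size_t)n >= sizeof(full))
        return false; /* name did not fit in the buffer */

    return strcmp(want, full) == 0;
}
```

Under these assumptions, 'pthread_create@@GLIBC_2.34' matches only the default
GLIBC_2.34 entry and 'pthread_create@GLIBC_2.2.5' matches only the non-default
one. In this sketch a bare 'pthread_create' matches both, so the caller would
still have to decide how to treat entries that share a name.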

version information would be present in the string found in the
string table.

We now look up the correct version string in the version symbol
table before constructing the full name and then comparing.

This patch adds support for both name@version and name@@version to
match output of the various elf parsers.

Signed-off-by: Espen Grindhaug <espen.grindhaug@xxxxxxxxx>

[...]