Re: [PATCH bpf-next 1/2] bpf: Add a bpf_kallsyms_lookup helper

From: Yonghong Song
Date: Fri Nov 27 2020 - 11:09:48 EST

On 11/27/20 3:20 AM, KP Singh wrote:
On Fri, Nov 27, 2020 at 8:35 AM Yonghong Song <yhs@xxxxxx> wrote:

On 11/26/20 8:57 AM, Florent Revest wrote:
This helper exposes the kallsyms_lookup function to eBPF tracing
programs. This can be used to retrieve the name of the symbol at an
address. For example, when hooking into nf_register_net_hook, one can
audit the name of the registered netfilter hook and potentially also
the name of the module in which the symbol is located.

Signed-off-by: Florent Revest <revest@xxxxxxxxxx>
---
include/uapi/linux/bpf.h | 16 +++++++++++++
kernel/trace/bpf_trace.c | 41 ++++++++++++++++++++++++++++++++++
tools/include/uapi/linux/bpf.h | 16 +++++++++++++
3 files changed, 73 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c3458ec1f30a..670998635eac 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3817,6 +3817,21 @@ union bpf_attr {
* The **hash_algo** is returned on success,
* **-EOPNOTSUP** if IMA is disabled or **-EINVAL** if
* invalid arguments are passed.
+ *
+ * long bpf_kallsyms_lookup(u64 address, char *symbol, u32 symbol_size, char *module, u32 module_size)
+ * Description
+ * Uses kallsyms to write the name of the symbol at *address*
+ * into *symbol* of size *symbol_sz*. This is guaranteed to be
+ * zero terminated.
+ * If the symbol is in a module, up to *module_size* bytes of
+ * the module name is written in *module*. This is also
+ * guaranteed to be zero-terminated. Note: a module name
+ * is always shorter than 64 bytes.
+ * Return
+ * On success, the strictly positive length of the full symbol
+ * name, If this is greater than *symbol_size*, the written
+ * symbol is truncated.
+ * On error, a negative value.
*/
#define __BPF_FUNC_MAPPER(FN) \
FN(unspec), \
@@ -3981,6 +3996,7 @@ union bpf_attr {
FN(bprm_opts_set), \
FN(ktime_get_coarse_ns), \
FN(ima_inode_hash), \
+ FN(kallsyms_lookup), \
/* */

/* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index d255bc9b2bfa..9d86e20c2b13 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -17,6 +17,7 @@
#include <linux/error-injection.h>
#include <linux/btf_ids.h>
#include <linux/bpf_lsm.h>
+#include <linux/kallsyms.h>

#include <net/bpf_sk_storage.h>

@@ -1260,6 +1261,44 @@ const struct bpf_func_proto bpf_snprintf_btf_proto = {
.arg5_type = ARG_ANYTHING,
};

+BPF_CALL_5(bpf_kallsyms_lookup, u64, address, char *, symbol, u32, symbol_size,
+ char *, module, u32, module_size)
+{
+ char buffer[KSYM_SYMBOL_LEN];
+ unsigned long offset, size;
+ const char *name;
+ char *modname;
+ long ret;
+
+ name = kallsyms_lookup(address, &size, &offset, &modname, buffer);
+ if (!name)
+ return -EINVAL;
+
+ ret = strlen(name) + 1;
+ if (symbol_size) {
+ strncpy(symbol, name, symbol_size);
+ symbol[symbol_size - 1] = '\0';
+ }
+
+ if (modname && module_size) {
+ strncpy(module, modname, module_size);
+ module[module_size - 1] = '\0';

In this case, the module name may be truncated and the user gets no
indication of that from the return value. The helper description mentions
that a module name is currently always shorter than 64 bytes, but from a
UAPI perspective it may still be good to return something that lets the
user know the name was truncated.
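
One possible way to make truncation visible, sketched in plain userspace C
rather than kernel code: copy with a guaranteed terminator and return the
full length of the source (terminator included), so the caller can compare
the result with its buffer size. The name copy_name_report_len is
hypothetical and does not come from the patch or the kernel.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 * Copies src into dst (capacity: size bytes), always NUL-terminates when
 * size is non-zero, and returns the full length of src including the NUL.
 * A return value greater than size means the copy was truncated.
 */
static long copy_name_report_len(char *dst, size_t size, const char *src)
{
	size_t full = strlen(src) + 1;

	if (size) {
		size_t n = full <= size ? full : size;

		memcpy(dst, src, n);
		dst[n - 1] = '\0';
	}

	return (long)full;
}
```

With this convention, a return value greater than the buffer size means the
name was truncated, which matches what the patch already documents for the
symbol name.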

I do not know what the best way to do this is. One suggestion is
to break it into two helpers, one for the symbol name and another

I think it would be slightly preferable to have one helper, though.
Maybe something like bpf_get_symbol_info (better names anyone? :))
with flags to get the module name or the symbol name depending
on the flag?

This works even better. Previously I was thinking that if we had two
helpers, we could add flags to each of them for future extension. But we
can certainly have just one helper, with flags to indicate whether the
request is for the module name, the symbol name, or something else.

The buffer can be something like
union bpf_ksymbol_info {
char module_name[];
char symbol_name[];
...
}
and flags will indicate what information the user wants.
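
A rough userspace sketch of the single-helper-with-flags idea follows.
Everything in it is an assumption made for illustration: the flag names, the
function name get_symbol_info, and the fixed table standing in for kallsyms
(which, unlike the real lookup, only matches exact addresses). It is not a
proposed UAPI.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flags: select which piece of information to return. */
#define KSYM_INFO_SYMBOL (1U << 0)
#define KSYM_INFO_MODULE (1U << 1)

/* Stand-in for kallsyms: a tiny fixed table, exact-address matches only. */
struct fake_sym {
	unsigned long addr;
	const char *name;
	const char *module;	/* NULL for built-in symbols */
};

static const struct fake_sym fake_syms[] = {
	{ 0x1000, "cpufreq_gov_powersave_init", "cpufreq_powersave" },
	{ 0x2000, "nf_register_net_hook", NULL },
};

/*
 * Writes either the symbol name or the module name into buf, depending on
 * flags. Returns the full length of the requested name including the NUL
 * (greater than size means truncated), or -EINVAL on bad flags or an
 * unknown address.
 */
static long get_symbol_info(unsigned long addr, char *buf, size_t size,
			    unsigned int flags)
{
	const struct fake_sym *sym = NULL;
	const char *src;
	size_t i, full, n;

	if (flags != KSYM_INFO_SYMBOL && flags != KSYM_INFO_MODULE)
		return -EINVAL;

	for (i = 0; i < sizeof(fake_syms) / sizeof(fake_syms[0]); i++) {
		if (fake_syms[i].addr == addr) {
			sym = &fake_syms[i];
			break;
		}
	}
	if (!sym)
		return -EINVAL;

	src = (flags == KSYM_INFO_SYMBOL) ? sym->name : sym->module;
	if (!src)
		src = "";	/* built-in symbol: empty module name */

	full = strlen(src) + 1;
	if (size) {
		n = full <= size ? full : size;
		memcpy(buf, src, n);
		buf[n - 1] = '\0';
	}

	return (long)full;
}
```

Because each call returns exactly one name, the return value can report
truncation for the module name in the same way as for the symbol name.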


for the module name. What are the use cases where people want both the
symbol name and the module name, and is that common?

The use case would be to disambiguate symbols in the core
kernel from those in a kernel module, similar to what
/proc/kallsyms does:

T cpufreq_gov_powersave_init [cpufreq_powersave]


+ }
+
+ return ret;
+}
+
+const struct bpf_func_proto bpf_kallsyms_lookup_proto = {
+ .func = bpf_kallsyms_lookup,
+ .gpl_only = false,
+ .ret_type = RET_INTEGER,
+ .arg1_type = ARG_ANYTHING,
+ .arg2_type = ARG_PTR_TO_MEM,
ARG_PTR_TO_UNINIT_MEM?

+ .arg3_type = ARG_CONST_SIZE,
ARG_CONST_SIZE_OR_ZERO? This matters especially for the current format,
which tries to return both the symbol name and the module name, while the
user may only want one of them.

+ .arg4_type = ARG_PTR_TO_MEM,
ARG_PTR_TO_UNINIT_MEM?

+ .arg5_type = ARG_CONST_SIZE,
ARG_CONST_SIZE_OR_ZERO?

+};
+
const struct bpf_func_proto *
bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
{
@@ -1356,6 +1395,8 @@ bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
return &bpf_per_cpu_ptr_proto;
case BPF_FUNC_bpf_this_cpu_ptr:
return &bpf_this_cpu_ptr_proto;
+ case BPF_FUNC_kallsyms_lookup:
+ return &bpf_kallsyms_lookup_proto;
default:
return NULL;
}
[...]