Re: [RFC] ftrace: Add support to keep some functions out of ftrace

From: Chen Zhongjin
Date: Sun Aug 14 2022 - 22:08:07 EST



On 2022/8/14 23:22, Jiri Olsa wrote:
On Sat, Aug 13, 2022 at 03:02:52PM -0400, Steven Rostedt wrote:
On Fri, 12 Aug 2022 23:18:15 +0200
Jiri Olsa <olsajiri@xxxxxxxxx> wrote:

the patch below moves the bpf function into sepatate object and switches
off the -mrecord-mcount for it.. so the function gets profile call
generated but it's not visible to ftrace

this seems to work, but it depends on -mrecord-mcount support in gcc and
it's x86 specific... other archs seems to use -fpatchable-function-entry,
which does not seem to have option to omit symbol from being collected
to the section
As I stated. the __mcount_loc section was created by ftrace. It has
nothing to do with -fpatchable-function-entry. It's just that the archs
that use that, do not have a gcc that creates the __mcount_loc section.

disabling specific ftrace symbol with FTRACE_FL_DISABLED flag seems to
be easir and generic solution.. I'll send RFC for that
It's not easier.

Here, I have a POC for you and some more history.

The recordmcount.pl Perl script was the first to create the
__mcount_loc section in all objects that ftrace needed to hook to. It
did an objdump, found the locations of the calls to mcount, created
another file that had a section __mcount_loc that referenced all those
locations. Compiled and relinked that back into the original object.

This was performed on all object files during the build, and had an
impact on build times. But this is when I also created and introduced
"make localmodconfig", which shrunk the build times for many people, so
nobody noticed the build time increase! :-)

Then John Reiser sent me a patch that created recordmcount.c that did
the same work but directly modified the ELF object files without having
to run scripts. This got rid of that horrible overhead from the scripts.

Then, the gcc folks decided to be helpful here as well and created the
--mrecord-mcount option that would create the __mcount_loc section for
us, where we no longer needed the recordmcount scripts/C program. But
is not available across the board.

Today, objtool has also got involved, and added an "--mcount" option
that will create the section too.
I overlooked that objtool is involved as well,
will check on that

objtool --mcount option only involves mcount_loc generation (see annotate_call_site) and other validation check call destination directly (see is_fentry_call).

Some simply removing --mcount option dose work for this.


Another question, it seems we can export and use DEFINE_BPF_DISPATCHER out of kernel, does that means we should add NO_MCOUNT_FILES for these single uages as well?

I dont think it can be made automatically. If ignored, this can be a trouble.


Best,

Chen

Below is a patch that extends yours by adding a NO_MCOUNT_FILES list,
that you add the object file to and it will prevent the other methods
from adding an mcount_loc location.
thanks,
jirka

I'm adding the objtool folks to make sure this is fine with them.
Again, this is a proof of concept, but works. It may need to be cleaned
a bit before it is final.

-- Steve

Index: linux-trace.git/scripts/Makefile.build
===================================================================
--- linux-trace.git.orig/scripts/Makefile.build
+++ linux-trace.git/scripts/Makefile.build
@@ -206,8 +206,10 @@ sub_cmd_record_mcount = perl $(srctree)/
"$(if $(part-of-module),1,0)" "$(@)";
recordmcount_source := $(srctree)/scripts/recordmcount.pl
endif # BUILD_C_RECORDMCOUNT
-cmd_record_mcount = $(if $(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags)), \
+chk_sub_cmd_record_mcount = $(if $(filter $(shell basename $@),$(NO_MCOUNT_FILES)),, \
$(sub_cmd_record_mcount))
+cmd_record_mcount = $(if $(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags)), \
+ $(chk_sub_cmd_record_mcount))
endif # CONFIG_FTRACE_MCOUNT_USE_RECORDMCOUNT
# 'OBJECT_FILES_NON_STANDARD := y': skip objtool checking for a directory
Index: linux-trace.git/scripts/Makefile.lib
===================================================================
--- linux-trace.git.orig/scripts/Makefile.lib
+++ linux-trace.git/scripts/Makefile.lib
@@ -233,7 +233,8 @@ objtool_args = \
$(if $(CONFIG_HAVE_JUMP_LABEL_HACK), --hacks=jump_label) \
$(if $(CONFIG_HAVE_NOINSTR_HACK), --hacks=noinstr) \
$(if $(CONFIG_X86_KERNEL_IBT), --ibt) \
- $(if $(CONFIG_FTRACE_MCOUNT_USE_OBJTOOL), --mcount) \
+ $(if $(filter $(shell basename $@),$(NO_MCOUNT_FILES)),, \
+ $(if $(CONFIG_FTRACE_MCOUNT_USE_OBJTOOL), --mcount)) \
$(if $(CONFIG_UNWINDER_ORC), --orc) \
$(if $(CONFIG_RETPOLINE), --retpoline) \
$(if $(CONFIG_SLS), --sls) \
Index: linux-trace.git/net/core/Makefile
===================================================================
--- linux-trace.git.orig/net/core/Makefile
+++ linux-trace.git/net/core/Makefile
@@ -11,10 +11,15 @@ obj-$(CONFIG_SYSCTL) += sysctl_net_core.
obj-y += dev.o dev_addr_lists.o dst.o netevent.o \
neighbour.o rtnetlink.o utils.o link_watch.o filter.o \
sock_diag.o dev_ioctl.o tso.o sock_reuseport.o \
- fib_notifier.o xdp.o flow_offload.o gro.o
+ fib_notifier.o xdp.o flow_offload.o gro.o \
+ dispatcher.o
obj-$(CONFIG_NETDEV_ADDR_LIST_TEST) += dev_addr_lists_test.o
+# remove dispatcher function from ftrace sight
+CFLAGS_REMOVE_dispatcher.o = -mrecord-mcount
+NO_MCOUNT_FILES += dispatcher.o
+
obj-y += net-sysfs.o
obj-$(CONFIG_PAGE_POOL) += page_pool.o
obj-$(CONFIG_PROC_FS) += net-procfs.o
Index: linux-trace.git/net/core/dispatcher.c
===================================================================
--- /dev/null
+++ linux-trace.git/net/core/dispatcher.c
@@ -0,0 +1,3 @@
+#include <linux/bpf.h>
+
+DEFINE_BPF_DISPATCHER(xdp)
Index: linux-trace.git/net/core/filter.c
===================================================================
--- linux-trace.git.orig/net/core/filter.c
+++ linux-trace.git/net/core/filter.c
@@ -11162,8 +11162,6 @@ const struct bpf_verifier_ops sk_lookup_
#endif /* CONFIG_INET */
-DEFINE_BPF_DISPATCHER(xdp)
-
void bpf_prog_change_xdp(struct bpf_prog *prev_prog, struct bpf_prog *prog)
{
bpf_dispatcher_change_prog(BPF_DISPATCHER_PTR(xdp), prev_prog, prog);