Re: [PATCH] trace: Add trace any kernel object

From: Huan Xie
Date: Wed Oct 20 2021 - 12:04:56 EST


On Fri, Oct 15, 2021 at 9:08 PM Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
>
> On Wed, 13 Oct 2021 22:20:56 -0400
> Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> > [ Added Masami ]
> >
> > On Thu, 14 Oct 2021 09:44:45 +0800
> > Jeff Xie <xiehuan09@xxxxxxxxx> wrote:
> >
> > > Introduce a method based on function tracer and kprobe event
> > > to trace any object in the linux kernel.
> >
> > I first want to say that this looks really interesting, and something
> > that can help out a lot!
> >
> > That said, we are going to have to work with you to come up with a
> > better and more flexible interface, and make sure locking and
> > synchronization works.
>
> Agreed. This is really interesting. I used to try to add hwbreak event
> but the problem is how to set up the hw breakpoint when I found the
> interesting one dynamically from kprobes.
>
> > >
> > > Usage:
> > > When using the kprobe event, only need to enable trace_object,
> > > we can trace any function argument or the return value. When
> > > the object passes through a function, it can be traced.
>
> With this idea, we don't have to have a hwbreak but just focus on
> the parameters etc.
>
> > >
> > > For example:
> > >
> > > For the function bio_add_page, we can trace the first argument:
> > >
> > > int bio_add_page(struct bio *bio, struct page *page,
> > > unsigned int len, unsigned int offset)
> > >
> > > #cd /sys/kernel/debug/tracing
> > > #echo 1 > ./trace_object
>
> Hmm, so what happen if there are several events in parallel?
>
> > > #echo 'p bio_add_page arg1=$arg1' > kprobe_events
> > > #echo 1 > ./events/kprobes/p_bio_add_page_0/enable
>
> what about adding a "objfilter" trigger action to update the object
> list? e.g.
>
> echo objfilter:add:arg1:1 if comm == "sync" >> ./events/kprobes/p_bio_add_page_0/trigger
>
> This will "add the value of 'arg1' to objfilter 1 time if the process
> name is sync".

Thanks. The trigger seems to work in combination with kprobe/uprobe/eprobe.

I have been trying to implement this feature in recent days, But my
code does not look very elegant at the moment :-)

It seems the result of 'cat trace' should also need to be filtered,
Because it will output the functions we don’t want to care about.

I didn't expect this idea to be a relatively large project. :-)

>
> And maybe before that, we need a "objtrace" tracer because
> as far as I can see, the trace output below is not compatible
> with function/fgraph tracer.
>
> Anyway, this is a great work!
>
> Thank you!
>
> > > #sync
> > > #cat ./trace
> > >
> > > # tracer: nop
> > > #
> > > # entries-in-buffer/entries-written: 367/367 #P:4
> > > #
> > > # _-----=> irqs-off
> > > # / _----=> need-resched
> > > # | / _---=> hardirq/softirq
> > > # || / _--=> preempt-depth
> > > # ||| / _-=> migrate-disable
> > > # |||| / delay
> > > # TASK-PID CPU# ||||| TIMESTAMP FUNCTION
> > > # | | | ||||| | |
> > > kworker/u8:5-89 [000] ..... 60.310385: p_bio_add_page_0: (bio_add_page+0x0/0x90) arg1=0xffff888102bde780
> > > kworker/u8:5-89 [000] ..... 60.310386: bio_add_page <-ext4_bio_write_page object:0xffff888102bde780
> > > kworker/u8:5-89 [000] ..... 60.310387: __bio_try_merge_page <-bio_add_page object:0xffff888102bde780
> > > kworker/u8:5-89 [000] ..... 60.310387: __bio_add_page <-bio_add_page object:0xffff888102bde780
> > > kworker/u8:5-89 [000] ..... 60.310389: submit_bio <-ext4_io_submit object:0xffff888102bde780
> > > kworker/u8:5-89 [000] ..... 60.310389: submit_bio <-ext4_io_submit object:0xffff888102bde780
> > > kworker/u8:5-89 [000] ..... 60.310389: submit_bio_noacct <-ext4_io_submit object:0xffff888102bde780
> > > kworker/u8:5-89 [000] ..... 60.310389: submit_bio_checks <-submit_bio_noacct object:0xffff888102bde780
> > > kworker/u8:5-89 [000] ..... 60.310390: __cond_resched <-submit_bio_checks object:0xffff888102bde780
> > > kworker/u8:5-89 [000] ..... 60.310390: rcu_all_qs <-__cond_resched object:0xffff888102bde780
> > > kworker/u8:5-89 [000] ..... 60.310390: should_fail_bio <-submit_bio_checks object:0xffff888102bde780
> > > kworker/u8:5-89 [000] ..... 60.310391: blk_mq_submit_bio <-submit_bio_noacct object:0xffff888102bde780
> > > kworker/u8:5-89 [000] ..... 60.310392: blk_attempt_plug_merge <-blk_mq_submit_bio object:0xffff888102bde780
> > > kworker/u8:5-89 [000] ..... 60.310392: __blk_mq_sched_bio_merge <-blk_mq_submit_bio object:0xffff888102bde780
> > > kworker/u8:5-89 [000] ..... 60.310392: __blk_mq_alloc_request <-blk_mq_submit_bio object:0xffff888102bde780
> > > kworker/u8:5-89 [000] ..... 60.310393: blk_mq_get_tag <-__blk_mq_alloc_request object:0xffff888102bde780
> > > kworker/u8:5-89 [000] ..... 60.310394: blk_account_io_start <-blk_mq_submit_bio object:0xffff888102bde780
> > > <idle>-0 [000] d.... 60.311264: bio_advance <-blk_update_request object:0xffff888102bde780
> > > <idle>-0 [000] d.... 60.311265: bio_endio <-blk_update_request object:0xffff888102bde780
> > > <idle>-0 [000] d.... 60.311265: ext4_end_bio <-blk_update_request object:0xffff888102bde780
> > > <idle>-0 [000] d.... 60.311266: ext4_finish_bio <-ext4_end_bio object:0xffff888102bde780
> > > <idle>-0 [000] d.... 60.311274: bio_put <-ext4_end_bio object:0xffff888102bde780
> > > <idle>-0 [000] d.... 60.311274: bio_free <-ext4_end_bio object:0xffff888102bde780
> > > <idle>-0 [000] d.... 60.311274: bio_free <-ext4_end_bio object:0xffff888102bde780
> > > <idle>-0 [000] d.... 60.311274: bvec_free <-bio_free object:0xffff888102bde780
> > > <idle>-0 [000] d.... 60.311275: mempool_free <-bio_free object:0xffff888102bde780
> > > <idle>-0 [000] d.... 60.311275: mempool_free_slab <-bio_free object:0xffff888102bde780
> > > <idle>-0 [000] d.... 60.311275: mempool_free <-ext4_end_bio object:0xffff888102bde780
> > > <idle>-0 [000] d.... 60.311275: mempool_free_slab <-ext4_end_bio object:0xffff888102bde780
> > > <idle>-0 [000] d.... 60.311275: kmem_cache_free <-ext4_end_bio object:0xffff888102bde780
> > > <idle>-0 [000] d.... 60.311276: kmem_cache_free <-ext4_end_bio object:0xffff888102bde780
> > > ...
> > >
> >
> > So if I understand the above, if you set this "trace_object" then
> > whatever gets placed into a kprobe parameter ($arg1?) or a return value
> > of a kretprobe, gets traced at every other function call?
> >
> >
> > > Signed-off-by: Jeff Xie <xiehuan09@xxxxxxxxx>
> > > ---
> > > include/linux/ftrace.h | 6 ++
> > > kernel/trace/Kconfig | 7 ++
> > > kernel/trace/Makefile | 1 +
> > > kernel/trace/trace.c | 69 +++++++++++++++++++
> > > kernel/trace/trace.h | 2 +
> > > kernel/trace/trace_entries.h | 17 +++++
> > > kernel/trace/trace_kprobe.c | 6 ++
> > > kernel/trace/trace_object.c | 128 +++++++++++++++++++++++++++++++++++
> > > kernel/trace/trace_output.c | 40 +++++++++++
> > > 9 files changed, 276 insertions(+)
> > > create mode 100644 kernel/trace/trace_object.c
> > >
>
> --
> Masami Hiramatsu <mhiramat@xxxxxxxxxx>

--
JeffXie