Re: [PATCH 0/3] mm: cma: /proc/cmainfo

From: Steven Rostedt
Date: Fri Jan 09 2015 - 09:19:17 EST



Wow, too much work over the holidays ;-)


On Tue, 30 Dec 2014 13:47:26 +0900
Minchan Kim <minchan@xxxxxxxxxx> wrote:


> Then, I don't think we could keep all of allocations. What we need
> is only slow allocations. I hope we can do that with ftrace.
>
> ex)
>
> # cd /sys/kernel/debug/tracing
> # echo 1 > options/stacktrace
> # echo cam_alloc > set_ftrace_filter
> # echo your_threshold > tracing_thresh
>
> I know it doesn't work now but I think it's more flexible
> and general way to handle such issues(ie, latency of some functions).
> So, I hope we could enhance ftrace rather than new wheel.
> Ccing ftrace people.
>

I've been working on trace-cmd this month and came up with a new
"profile" command. I don't have cma_alloc but doing something like this
with kmalloc.


# trace-cmd profile -S -p function_graph -l __kmalloc -l '__kmalloc:stacktrace' --stderr workload 2>profile.out

and this gives me in profile.out, something like this:

------
CPU: 0
entries: 0
overrun: 0
commit overrun: 0
bytes: 3560
oldest event ts: 349.925480
now ts: 356.910819
dropped events: 0
read events: 36

CPU: 1
entries: 0
overrun: 0
commit overrun: 0
bytes: 408
oldest event ts: 354.610624
now ts: 356.910838
dropped events: 0
read events: 48

CPU: 2
entries: 0
overrun: 0
commit overrun: 0
bytes: 3184
oldest event ts: 356.761870
now ts: 356.910854
dropped events: 0
read events: 1830

CPU: 3
entries: 6
overrun: 0
commit overrun: 0
bytes: 2664
oldest event ts: 356.440675
now ts: 356.910875
dropped events: 0
read events: 717

[...]

task: <...>-2880
Event: func: __kmalloc() (74) Total: 53254 Avg: 719 Max: 1095 Min:481
|
+ ftrace_ops_list_func (0xffffffff810c229e)
100% (74) time:53254 max:1095 min:481 avg:719
ftrace_call (0xffffffff81526047)
trace_preempt_on (0xffffffff810d28ff)
preempt_count_sub (0xffffffff81061c62)
__mutex_lock_slowpath (0xffffffff81522807)
__kmalloc (0xffffffff811323f3)
__kmalloc (0xffffffff811323f3)
tracing_buffers_splice_read (0xffffffff810ca23e)
|
+ set_next_entity (0xffffffff81067027)
| 66% (49) time:34925 max:1044 min:481 avg:712
| __switch_to (0xffffffff810016d7)
| trace_hardirqs_on (0xffffffff810d28db)
| _raw_spin_unlock_irq (0xffffffff81523a8e)
| trace_preempt_on (0xffffffff810d28ff)
| preempt_count_sub (0xffffffff81061c62)
| __schedule (0xffffffff815204d3)
| trace_preempt_on (0xffffffff810d28ff)
| buffer_spd_release (0xffffffff810c91fd)
| SyS_splice (0xffffffff8115dccf)
| system_call_fastpath (0xffffffff81523f92)
|
+ do_read_fault.isra.74 (0xffffffff8111431d)
| 24% (18) time:12654 max:1008 min:481 avg:703
| |
| + select_task_rq_fair (0xffffffff81067806)
| | 89% (16) time:11234 max:1008 min:481 avg:702
| | trace_preempt_on (0xffffffff810d28ff)
| | buffer_spd_release (0xffffffff810c91fd)
| | SyS_splice (0xffffffff8115dccf)
| | system_call_fastpath (0xffffffff81523f92)
| |
| + handle_mm_fault (0xffffffff81114df4)
| 11% (2) time:1420 max:879 min:541 avg:710
| trace_preempt_on (0xffffffff810d28ff)
| buffer_spd_release (0xffffffff810c91fd)
| SyS_splice (0xffffffff8115dccf)
| system_call_fastpath (0xffffffff81523f92)
|
|
|
+ update_stats_wait_end (0xffffffff81066c5c)
| 6% (4) time:3153 max:1095 min:635 avg:788
| set_next_entity (0xffffffff81067027)
| __switch_to (0xffffffff810016d7)
| trace_hardirqs_on (0xffffffff810d28db)
| _raw_spin_unlock_irq (0xffffffff81523a8e)
| trace_preempt_on (0xffffffff810d28ff)
| preempt_count_sub (0xffffffff81061c62)
| __schedule (0xffffffff815204d3)
| trace_preempt_on (0xffffffff810d28ff)
| buffer_spd_release (0xffffffff810c91fd)
| SyS_splice (0xffffffff8115dccf)
| system_call_fastpath (0xffffffff81523f92)
|
+ _raw_spin_unlock (0xffffffff81523af5)
| 3% (2) time:1854 max:936 min:918 avg:927
| do_read_fault.isra.74 (0xffffffff8111431d)
| handle_mm_fault (0xffffffff81114df4)
| buffer_spd_release (0xffffffff810c91fd)
| SyS_splice (0xffffffff8115dccf)
| system_call_fastpath (0xffffffff81523f92)
|
+ trace_hardirqs_off (0xffffffff810d2891)
1% (1) time:668 max:668 min:668 avg:668
kmem_cache_free (0xffffffff81130e48)
__dequeue_signal (0xffffffff8104c802)
trace_preempt_on (0xffffffff810d28ff)
preempt_count_sub (0xffffffff81061c62)
_raw_spin_unlock_irq (0xffffffff81523a8e)
recalc_sigpending (0xffffffff8104c5d1)
__set_task_blocked (0xffffffff8104cd2e)
trace_preempt_on (0xffffffff810d28ff)
preempt_count_sub (0xffffffff81061c62)
preempt_count_sub (0xffffffff81061c62)
buffer_spd_release (0xffffffff810c91fd)
SyS_splice (0xffffffff8115dccf)
system_call_fastpath (0xffffffff81523f92)

If you want better names, I would add "-e sched_switch", as that will
record the comms of the tasks and you don't end up with a bunch of
"<...>".

Is this something you are looking for. The profile command does not
save to disk, thus it does the analysis live, and you don't need to
worry about running out of disk space. Although, since it is live, it
may tend to drop more events (see the "overrun values").

You can get trace-cmd from:

git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git

You'll need the latest from the master branch, as even 2.5 doesn't have
the --stderr yet.

Make sure to do a make install and make install_doc, then you can do:

man trace-cmd-record
man trace-cmd-profile

to read about all the options.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/