Re: [PATCH 2/2] perf tools: Improve call graph documents and help messages

From: Namhyung Kim
Date: Thu Oct 22 2015 - 12:06:13 EST


On Thu, Oct 22, 2015 at 11:46 PM, Arnaldo Carvalho de Melo
<acme@xxxxxxxxxx> wrote:
> Em Thu, Oct 22, 2015 at 11:28:32PM +0900, Namhyung Kim escreveu:
>> The --call-graph option is complex, so we should provide a better guide for
>> users. Also change the help message to be consistent with the config option
>> names. Now perf top will show help like below:
>>
>> $ perf top --call-graph
>> Error: option `call-graph' requires a value
>>
>> Usage: perf top [<options>]
>>
>> --call-graph <record_mode[,record_size],print_type,threshold[,print_limit],order,sort_key[,branch]>
>> setup and enables call-graph (stack chain/backtrace):
>>
>> record_mode: call graph recording mode (fp|dwarf|lbr)
>> record_size: if record_mode is 'dwarf', max size of stack recording (<bytes>)
>> default: 8192 (bytes)
>> print_type: call graph printing style (graph|flat|fractal|none)
>> threshold: minimum call graph inclusion threshold (<percent>)
>> print_limit: maximum number of call graph entry (<number>)
>> order: call graph order (caller|callee)
>> sort_key: call graph sort key (function|address)
>> branch: include last branch info to call graph (branch)
>>
>> Default: fp,graph,0.5,caller,function
>
> At some point it would be nice to be able to use:
>
> perf top --call-graph caller
>
> And have that be equivalent to:
>
> perf top --call-graph fp,graph,0.5,caller,function
>
> I.e. change just one of the defaults.


?? That's already supported by commit e8232f1ad468 ("perf report:
Relax -g option parsing not to limit the option order").
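
For illustration, here is a minimal sketch of what order-independent parsing
of these tokens could look like. It is not the code from that commit: the
names cg_opts, cg_defaults, match and cg_parse are made up for this example,
and it only covers the print-side keys from the help text quoted above. Each
token is matched against the known keywords; the first number is taken as
the threshold and a later number as print_limit, which is why print_limit
has to come after threshold.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the print-side settings; not perf's own struct. */
struct cg_opts {
	const char *print_type;	/* graph|flat|fractal|none */
	const char *order;	/* caller|callee */
	const char *sort_key;	/* function|address */
	double threshold;	/* percent */
	long print_limit;	/* 0 means unlimited */
	int branch;		/* 1 if "branch" was given */
};

/* Defaults as in the help text: graph,0.5,caller,function */
static const struct cg_opts cg_defaults = {
	"graph", "caller", "function", 0.5, 0, 0
};

/* Return the matching entry of a NULL-terminated keyword list, or NULL. */
static const char *match(const char *tok, const char *const *names)
{
	for (; *names; names++)
		if (!strcmp(tok, *names))
			return *names;
	return NULL;
}

/* Parse a comma-separated option string; 0 on success, -1 on a bad token. */
static int cg_parse(const char *arg, struct cg_opts *o)
{
	static const char *const types[] = { "graph", "flat", "fractal", "none", NULL };
	static const char *const orders[] = { "caller", "callee", NULL };
	static const char *const keys[] = { "function", "address", NULL };
	char buf[128], *tok, *end;
	const char *m;
	int seen_threshold = 0;

	*o = cg_defaults;
	/* Work on a copy since strtok() modifies its input (long input is truncated). */
	snprintf(buf, sizeof(buf), "%s", arg);

	for (tok = strtok(buf, ","); tok; tok = strtok(NULL, ",")) {
		if ((m = match(tok, types))) {
			o->print_type = m;
		} else if ((m = match(tok, orders))) {
			o->order = m;
		} else if ((m = match(tok, keys))) {
			o->sort_key = m;
		} else if (!strcmp(tok, "branch")) {
			o->branch = 1;
		} else if (!seen_threshold) {
			/* The first numeric token is the threshold. */
			o->threshold = strtod(tok, &end);
			if (end == tok || *end)
				return -1;
			seen_threshold = 1;
		} else {
			/* A later numeric token is the print limit. */
			o->print_limit = strtol(tok, &end, 10);
			if (end == tok || *end)
				return -1;
		}
	}
	return 0;
}
```

With a parser shaped like this, cg_parse("caller", &o) changes only the
order and leaves every other field at its default.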

>
> But I think that how you did it is backwards compatible, i.e. adding
> stuff just to the end.
>
> Ah, noticed you forgot to update the 'top' man page

Heh, I just added the text below in the previous patch. ;-)

"See `--call-graph` section in perf-record and perf-report man pages
for details."

Thanks,
Namhyung


/me goes to bed now...

>
>>
>> Requested-by: Ingo Molnar <mingo@xxxxxxxxxx>
>> Cc: Adrian Hunter <adrian.hunter@xxxxxxxxx>
>> Cc: Borislav Petkov <bp@xxxxxxx>
>> Cc: Brendan Gregg <brendan.d.gregg@xxxxxxxxx>
>> Cc: Chandler Carruth <chandlerc@xxxxxxxxx>
>> Cc: Frederic Weisbecker <fweisbec@xxxxxxxxx>
>> Cc: Stephane Eranian <eranian@xxxxxxxxxx>
>> Cc: Wang Nan <wangnan0@xxxxxxxxxx>
>> Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
>> ---
>> tools/perf/Documentation/perf-record.txt | 9 ++++++--
>> tools/perf/Documentation/perf-report.txt | 38 ++++++++++++++++++++------------
>> tools/perf/builtin-record.c | 5 +++--
>> tools/perf/builtin-report.c | 11 +++++----
>> tools/perf/builtin-top.c | 5 +++--
>> tools/perf/util/callchain.h | 24 +++++++++++++++-----
>> 6 files changed, 62 insertions(+), 30 deletions(-)
>>
>> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
>> index b027d28658f2..7ff6a9d0ea0d 100644
>> --- a/tools/perf/Documentation/perf-record.txt
>> +++ b/tools/perf/Documentation/perf-record.txt
>> @@ -144,7 +144,7 @@ OPTIONS
>>
>> --call-graph::
>> Setup and enable call-graph (stack chain/backtrace) recording,
>> - implies -g.
>> + implies -g. Default is "fp".
>>
>> Allows specifying "fp" (frame pointer) or "dwarf"
>> (DWARF's CFI - Call Frame Information) or "lbr"
>> @@ -154,13 +154,18 @@ OPTIONS
>> In some systems, where binaries are build with gcc
>> --fomit-frame-pointer, using the "fp" method will produce bogus
>> call graphs, using "dwarf", if available (perf tools linked to
>> - the libunwind library) should be used instead.
>> + the libunwind or libdw library) should be used instead.
>> Using the "lbr" method doesn't require any compiler options. It
>> will produce call graphs from the hardware LBR registers. The
>> main limition is that it is only available on new Intel
>> platforms, such as Haswell. It can only get user call chain. It
>> doesn't work with branch stack sampling at the same time.
>>
>> + When "dwarf" recording is used, perf also records (user) stack dump
>> + when sampled. Default size of the stack dump is 8192 (bytes).
>> + User can change the size by passing the size after comma like
>> + "--call-graph dwarf,4096".
>> +
>> -q::
>> --quiet::
>> Don't print any message, useful for scripting.
>> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
>> index e4fdeeb51123..ab1fd64e3627 100644
>> --- a/tools/perf/Documentation/perf-report.txt
>> +++ b/tools/perf/Documentation/perf-report.txt
>> @@ -169,30 +169,40 @@ OPTIONS
>> --dump-raw-trace::
>> Dump raw trace in ASCII.
>>
>> --g [type,min[,limit],order[,key][,branch]]::
>> ---call-graph::
>> - Display call chains using type, min percent threshold, optional print
>> - limit and order.
>> - type can be either:
>> +-g::
>> +--call-graph=<print_type,threshold[,print_limit],order,sort_key,branch>::
>> + Display call chains using type, min percent threshold, print limit,
>> + call order, sort key and branch. Note that ordering of parameters is not
>> + fixed so any parameter can be given in an arbitrary order. One exception
>> + is the print_limit which should be preceded by threshold.
>> +
>> + print_type can be either:
>> - flat: single column, linear exposure of call chains.
>> - - graph: use a graph tree, displaying absolute overhead rates.
>> + - graph: use a graph tree, displaying absolute overhead rates. (default)
>> - fractal: like graph, but displays relative rates. Each branch of
>> - the tree is considered as a new profiled object. +
>> + the tree is considered as a new profiled object.
>> + - none: disable call chain display.
>> +
>> + threshold is a percentage value which specifies a minimum percent to be
>> + included in the output call graph. Default is 0.5 (%).
>> +
>> + print_limit is only applied when stdio interface is used. It's to limit
>> + number of call graph entries in a single hist entry. Note that it needs
>> + to be given after threshold (but not necessarily consecutive).
>> + Default is 0 (unlimited).
>>
>> order can be either:
>> - callee: callee based call graph.
>> - caller: inverted caller based call graph.
>> + Default is 'caller' when --children is used, otherwise 'callee'.
>>
>> - key can be:
>> - - function: compare on functions
>> + sort_key can be:
>> + - function: compare on functions (default)
>> - address: compare on individual code addresses
>>
>> branch can be:
>> - - branch: include last branch information in callgraph
>> - when available. Usually more convenient to use --branch-history
>> - for this.
>> -
>> - Default: graph,0.5,caller
>> + - branch: include last branch information in callgraph when available.
>> + Usually more convenient to use --branch-history for this.
>>
>> --children::
>> Accumulate callchain of children to parent entry so that then can
>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>> index 1a117623d396..2740d7a82ae8 100644
>> --- a/tools/perf/builtin-record.c
>> +++ b/tools/perf/builtin-record.c
>> @@ -1010,7 +1010,8 @@ static struct record record = {
>> },
>> };
>>
>> -const char record_callchain_help[] = CALLCHAIN_RECORD_HELP;
>> +const char record_callchain_help[] = CALLCHAIN_RECORD_HELP
>> + "\n\t\t\t\tDefault: fp";
>>
>> /*
>> * XXX Will stay a global variable till we fix builtin-script.c to stop messing
>> @@ -1058,7 +1059,7 @@ struct option __record_options[] = {
>> NULL, "enables call-graph recording" ,
>> &record_callchain_opt),
>> OPT_CALLBACK(0, "call-graph", &record.opts,
>> - "mode[,dump_size]", record_callchain_help,
>> + "record_mode[,record_size]", record_callchain_help,
>> &record_parse_callchain_opt),
>> OPT_INCR('v', "verbose", &verbose,
>> "be more verbose (show counter open errors, etc)"),
>> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
>> index 545c51cef7f7..50dd4d3d8667 100644
>> --- a/tools/perf/builtin-report.c
>> +++ b/tools/perf/builtin-report.c
>> @@ -625,8 +625,11 @@ parse_percent_limit(const struct option *opt, const char *str,
>> return 0;
>> }
>>
>> -const char report_callchain_help[] = "Display callchains using " CALLCHAIN_REPORT_HELP ". "
>> - "Default: graph,0.5,caller";
>> +#define CALLCHAIN_DEFAULT_OPT "graph,0.5,caller,function"
>> +
>> +const char report_callchain_help[] = "Display call graph (stack chain/backtrace):\n\n"
>> + CALLCHAIN_REPORT_HELP
>> + "\n\t\t\t\tDefault: " CALLCHAIN_DEFAULT_OPT;
>>
>> int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
>> {
>> @@ -636,7 +639,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
>> bool has_br_stack = false;
>> int branch_mode = -1;
>> bool branch_call_mode = false;
>> - char callchain_default_opt[] = "graph,0.5,caller";
>> + char callchain_default_opt[] = CALLCHAIN_DEFAULT_OPT;
>> const char * const report_usage[] = {
>> "perf report [<options>]",
>> NULL
>> @@ -703,7 +706,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
>> OPT_BOOLEAN('x', "exclude-other", &symbol_conf.exclude_other,
>> "Only display entries with parent-match"),
>> OPT_CALLBACK_DEFAULT('g', "call-graph", &report,
>> - "output_type,min_percent[,print_limit],call_order[,branch]",
>> + "print_type,threshold[,print_limit],order,sort_key[,branch]",
>> report_callchain_help, &report_parse_callchain_opt,
>> callchain_default_opt),
>> OPT_BOOLEAN(0, "children", &symbol_conf.cumulate_callchain,
>> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
>> index af849b1d7389..7e2e72e6d9d1 100644
>> --- a/tools/perf/builtin-top.c
>> +++ b/tools/perf/builtin-top.c
>> @@ -1093,7 +1093,8 @@ parse_percent_limit(const struct option *opt, const char *arg,
>> return 0;
>> }
>>
>> -const char top_callchain_help[] = CALLCHAIN_RECORD_HELP ", " CALLCHAIN_REPORT_HELP;
>> +const char top_callchain_help[] = CALLCHAIN_RECORD_HELP CALLCHAIN_REPORT_HELP
>> + "\n\t\t\t\tDefault: fp,graph,0.5,caller,function";
>>
>> int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
>> {
>> @@ -1173,7 +1174,7 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
>> NULL, "enables call-graph recording and display",
>> &callchain_opt),
>> OPT_CALLBACK(0, "call-graph", &top.record_opts,
>> - "mode[,dump_size],output_type,min_percent[,print_limit],call_order[,branch]",
>> + "record_mode[,record_size],print_type,threshold[,print_limit],order,sort_key[,branch]",
>> top_callchain_help, &parse_callchain_opt),
>> OPT_BOOLEAN(0, "children", &symbol_conf.cumulate_callchain,
>> "Accumulate callchains of children and show total overhead as well"),
>> diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
>> index aaf467c9ef2b..fce8161e54db 100644
>> --- a/tools/perf/util/callchain.h
>> +++ b/tools/perf/util/callchain.h
>> @@ -7,17 +7,29 @@
>> #include "event.h"
>> #include "symbol.h"
>>
>> -#define CALLCHAIN_HELP "setup and enables call-graph (stack chain/backtrace) recording: "
>> +#define HELP_PAD "\t\t\t\t"
>> +
>> +#define CALLCHAIN_HELP "setup and enables call-graph (stack chain/backtrace):\n\n"
>>
>> #ifdef HAVE_DWARF_UNWIND_SUPPORT
>> -#define CALLCHAIN_RECORD_HELP CALLCHAIN_HELP "fp dwarf lbr"
>> +# define RECORD_MODE_HELP HELP_PAD "record_mode:\tcall graph recording mode (fp|dwarf|lbr)\n"
>> #else
>> -#define CALLCHAIN_RECORD_HELP CALLCHAIN_HELP "fp lbr"
>> +# define RECORD_MODE_HELP HELP_PAD "record_mode:\tcall graph recording mode (fp|lbr)\n"
>> #endif
>>
>> -#define CALLCHAIN_REPORT_HELP "output_type (graph, flat, fractal, or none), " \
>> - "min percent threshold, optional print limit, callchain order, " \
>> - "key (function or address), add branches"
>> +#define RECORD_SIZE_HELP \
>> + HELP_PAD "record_size:\tif record_mode is 'dwarf', max size of stack recording (<bytes>)\n" \
>> + HELP_PAD "\t\tdefault: 8192 (bytes)\n"
>> +
>> +#define CALLCHAIN_RECORD_HELP CALLCHAIN_HELP RECORD_MODE_HELP RECORD_SIZE_HELP
>> +
>> +#define CALLCHAIN_REPORT_HELP \
>> + HELP_PAD "print_type:\tcall graph printing style (graph|flat|fractal|none)\n" \
>> + HELP_PAD "threshold:\tminimum call graph inclusion threshold (<percent>)\n" \
>> + HELP_PAD "print_limit:\tmaximum number of call graph entry (<number>)\n" \
>> + HELP_PAD "order:\t\tcall graph order (caller|callee)\n" \
>> + HELP_PAD "sort_key:\tcall graph sort key (function|address)\n" \
>> + HELP_PAD "branch:\t\tinclude last branch info to call graph (branch)\n"
>>
>> enum perf_call_graph_mode {
>> CALLCHAIN_NONE,
>> --
>> 2.6.0
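
As a side note on how the help text in callchain.h is assembled: the macros
rely on the C compiler merging adjacent string literals, so each builtin can
append its own "Default:" line to the shared pieces. A small standalone
sketch of that technique follows. The DEMO_* macros and demo_* names are
invented here and the strings are shortened; only HELP_PAD is taken from the
patch.

```c
#include <stdio.h>

#define HELP_PAD "\t\t\t\t"

/* Shortened stand-ins for the strings in the patch, for illustration only. */
#define DEMO_MODE_HELP HELP_PAD "record_mode:\tcall graph recording mode (fp|dwarf|lbr)\n"
#define DEMO_SIZE_HELP HELP_PAD "record_size:\tmax size of stack recording (<bytes>)\n"

/* Two macros expanding to literals side by side form one longer literal. */
#define DEMO_RECORD_HELP DEMO_MODE_HELP DEMO_SIZE_HELP

/* A command-specific default is appended the same way. */
static const char demo_help[] = DEMO_RECORD_HELP
	"\n" HELP_PAD "Default: fp";

/* Hypothetical helper that prints the composed help text. */
void demo_print_help(void)
{
	fputs(demo_help, stdout);
	fputc('\n', stdout);
}
```

Since the concatenation happens at compile time, demo_help is a single
constant string and needs no runtime formatting.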