Re: [PATCH 2/2] perf tools: Improve call graph documents and help messages

From: Arnaldo Carvalho de Melo
Date: Thu Oct 22 2015 - 10:46:53 EST


On Thu, Oct 22, 2015 at 11:28:32PM +0900, Namhyung Kim wrote:
> The --call-graph option is complex so we should provide better guide for
> users. Also change help message to be consistent with config option
> names. Now perf top will show help like below:
>
> $ perf top --call-graph
> Error: option `call-graph' requires a value
>
> Usage: perf top [<options>]
>
> --call-graph <record_mode[,record_size],print_type,threshold[,print_limit],order,sort_key[,branch]>
> setup and enables call-graph (stack chain/backtrace):
>
> record_mode: call graph recording mode (fp|dwarf|lbr)
> record_size: if record_mode is 'dwarf', max size of stack recording (<bytes>)
> default: 8192 (bytes)
> print_type: call graph printing style (graph|flat|fractal|none)
> threshold: minimum call graph inclusion threshold (<percent>)
> print_limit: maximum number of call graph entry (<number>)
> order: call graph order (caller|callee)
> sort_key: call graph sort key (function|address)
> branch: include last branch info to call graph (branch)
>
> Default: fp,graph,0.5,caller,function

At some point it would be nice to be able to use:

perf top --call-graph caller

And have that be equivalent to:

perf top --call-graph fp,graph,0.5,caller,function

I.e. change just one of the defaults.

But I think the way you did it is backward compatible, since it only
adds new fields at the end.
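
A minimal sketch of how the single-token form could work: start from the
defaults "fp,graph,0.5,caller,function" and let each recognized token
override only its own field. This is not perf's actual parser; the names
cg_opts, cg_match and cg_parse are hypothetical, and record_size and
print_limit are left out to keep it short (every bare number is treated
as the threshold).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the settings; not perf's own structure. */
struct cg_opts {
	const char *record_mode;	/* fp | dwarf | lbr */
	const char *print_type;		/* graph | flat | fractal | none */
	double threshold;		/* percent */
	const char *order;		/* caller | callee */
	const char *sort_key;		/* function | address */
};

/* Return the matching entry of a NULL-terminated name list, or NULL. */
static const char *cg_match(const char *tok, const char *const *names)
{
	for (; *names; names++)
		if (!strcmp(tok, *names))
			return *names;
	return NULL;
}

/*
 * Fill *o with the defaults, then apply the comma-separated tokens in
 * any order. Returns 0 on success, -1 on an empty or unknown token.
 */
static int cg_parse(const char *arg, struct cg_opts *o)
{
	static const char *const modes[] = { "fp", "dwarf", "lbr", NULL };
	static const char *const types[] = { "graph", "flat", "fractal", "none", NULL };
	static const char *const orders[] = { "caller", "callee", NULL };
	static const char *const keys[] = { "function", "address", NULL };
	const char *p = arg, *m;

	assert(arg && o);
	*o = (struct cg_opts){ "fp", "graph", 0.5, "caller", "function" };

	while (*p) {
		char tok[32], *end;
		size_t len = strcspn(p, ",");

		if (len == 0 || len >= sizeof(tok))
			return -1;
		memcpy(tok, p, len);
		tok[len] = '\0';

		if ((m = cg_match(tok, modes)))
			o->record_mode = m;
		else if ((m = cg_match(tok, types)))
			o->print_type = m;
		else if ((m = cg_match(tok, orders)))
			o->order = m;
		else if ((m = cg_match(tok, keys)))
			o->sort_key = m;
		else {
			/* Anything else must be a number: the threshold. */
			double v = strtod(tok, &end);

			if (end == tok || *end != '\0')
				return -1;
			o->threshold = v;
		}

		p += len;
		if (*p == ',')
			p++;
	}
	return 0;
}
```

With this, cg_parse("caller", &o) and
cg_parse("fp,graph,0.5,caller,function", &o) leave o in the same state,
and cg_parse("callee", &o) changes only the order field.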

Ah, I noticed you forgot to update the 'top' man page.

- Arnaldo

>
> Requested-by: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxx>
> Cc: Brendan Gregg <brendan.d.gregg@xxxxxxxxx>
> Cc: Chandler Carruth <chandlerc@xxxxxxxxx>
> Cc: Frederic Weisbecker <fweisbec@xxxxxxxxx>
> Cc: Stephane Eranian <eranian@xxxxxxxxxx>
> Cc: Wang Nan <wangnan0@xxxxxxxxxx>
> Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
> ---
> tools/perf/Documentation/perf-record.txt | 9 ++++++--
> tools/perf/Documentation/perf-report.txt | 38 ++++++++++++++++++++------------
> tools/perf/builtin-record.c | 5 +++--
> tools/perf/builtin-report.c | 11 +++++----
> tools/perf/builtin-top.c | 5 +++--
> tools/perf/util/callchain.h | 24 +++++++++++++++-----
> 6 files changed, 62 insertions(+), 30 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
> index b027d28658f2..7ff6a9d0ea0d 100644
> --- a/tools/perf/Documentation/perf-record.txt
> +++ b/tools/perf/Documentation/perf-record.txt
> @@ -144,7 +144,7 @@ OPTIONS
>
> --call-graph::
> Setup and enable call-graph (stack chain/backtrace) recording,
> - implies -g.
> + implies -g. Default is "fp".
>
> Allows specifying "fp" (frame pointer) or "dwarf"
> (DWARF's CFI - Call Frame Information) or "lbr"
> @@ -154,13 +154,18 @@ OPTIONS
> In some systems, where binaries are build with gcc
> --fomit-frame-pointer, using the "fp" method will produce bogus
> call graphs, using "dwarf", if available (perf tools linked to
> - the libunwind library) should be used instead.
> + the libunwind or libdw library) should be used instead.
> Using the "lbr" method doesn't require any compiler options. It
> will produce call graphs from the hardware LBR registers. The
> main limition is that it is only available on new Intel
> platforms, such as Haswell. It can only get user call chain. It
> doesn't work with branch stack sampling at the same time.
>
> + When "dwarf" recording is used, perf also records (user) stack dump
> + when sampled. Default size of the stack dump is 8192 (bytes).
> + User can change the size by passing the size after comma like
> + "--call-graph dwarf,4096".
> +
> -q::
> --quiet::
> Don't print any message, useful for scripting.
> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> index e4fdeeb51123..ab1fd64e3627 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -169,30 +169,40 @@ OPTIONS
> --dump-raw-trace::
> Dump raw trace in ASCII.
>
> --g [type,min[,limit],order[,key][,branch]]::
> ---call-graph::
> - Display call chains using type, min percent threshold, optional print
> - limit and order.
> - type can be either:
> +-g::
> +--call-graph=<print_type,threshold[,print_limit],order,sort_key,branch>::
> + Display call chains using type, min percent threshold, print limit,
> + call order, sort key and branch. Note that ordering of parameters is not
> + fixed so any parameter can be given in an arbitrary order. One exception
> + is the print_limit which should be preceded by threshold.
> +
> + print_type can be either:
> - flat: single column, linear exposure of call chains.
> - - graph: use a graph tree, displaying absolute overhead rates.
> + - graph: use a graph tree, displaying absolute overhead rates. (default)
> - fractal: like graph, but displays relative rates. Each branch of
> - the tree is considered as a new profiled object. +
> + the tree is considered as a new profiled object.
> + - none: disable call chain display.
> +
> + threshold is a percentage value which specifies a minimum percent to be
> + included in the output call graph. Default is 0.5 (%).
> +
> + print_limit is only applied when stdio interface is used. It's to limit
> + number of call graph entries in a single hist entry. Note that it needs
> + to be given after threshold (but not necessarily consecutive).
> + Default is 0 (unlimited).
>
> order can be either:
> - callee: callee based call graph.
> - caller: inverted caller based call graph.
> + Default is 'caller' when --children is used, otherwise 'callee'.
>
> - key can be:
> - - function: compare on functions
> + sort_key can be:
> + - function: compare on functions (default)
> - address: compare on individual code addresses
>
> branch can be:
> - - branch: include last branch information in callgraph
> - when available. Usually more convenient to use --branch-history
> - for this.
> -
> - Default: graph,0.5,caller
> + - branch: include last branch information in callgraph when available.
> + Usually more convenient to use --branch-history for this.
>
> --children::
> Accumulate callchain of children to parent entry so that then can
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 1a117623d396..2740d7a82ae8 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -1010,7 +1010,8 @@ static struct record record = {
> },
> };
>
> -const char record_callchain_help[] = CALLCHAIN_RECORD_HELP;
> +const char record_callchain_help[] = CALLCHAIN_RECORD_HELP
> + "\n\t\t\t\tDefault: fp";
>
> /*
> * XXX Will stay a global variable till we fix builtin-script.c to stop messing
> @@ -1058,7 +1059,7 @@ struct option __record_options[] = {
> NULL, "enables call-graph recording" ,
> &record_callchain_opt),
> OPT_CALLBACK(0, "call-graph", &record.opts,
> - "mode[,dump_size]", record_callchain_help,
> + "record_mode[,record_size]", record_callchain_help,
> &record_parse_callchain_opt),
> OPT_INCR('v', "verbose", &verbose,
> "be more verbose (show counter open errors, etc)"),
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index 545c51cef7f7..50dd4d3d8667 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -625,8 +625,11 @@ parse_percent_limit(const struct option *opt, const char *str,
> return 0;
> }
>
> -const char report_callchain_help[] = "Display callchains using " CALLCHAIN_REPORT_HELP ". "
> - "Default: graph,0.5,caller";
> +#define CALLCHAIN_DEFAULT_OPT "graph,0.5,caller,function"
> +
> +const char report_callchain_help[] = "Display call graph (stack chain/backtrace):\n\n"
> + CALLCHAIN_REPORT_HELP
> + "\n\t\t\t\tDefault: " CALLCHAIN_DEFAULT_OPT;
>
> int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
> {
> @@ -636,7 +639,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
> bool has_br_stack = false;
> int branch_mode = -1;
> bool branch_call_mode = false;
> - char callchain_default_opt[] = "graph,0.5,caller";
> + char callchain_default_opt[] = CALLCHAIN_DEFAULT_OPT;
> const char * const report_usage[] = {
> "perf report [<options>]",
> NULL
> @@ -703,7 +706,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
> OPT_BOOLEAN('x', "exclude-other", &symbol_conf.exclude_other,
> "Only display entries with parent-match"),
> OPT_CALLBACK_DEFAULT('g', "call-graph", &report,
> - "output_type,min_percent[,print_limit],call_order[,branch]",
> + "print_type,threshold[,print_limit],order,sort_key[,branch]",
> report_callchain_help, &report_parse_callchain_opt,
> callchain_default_opt),
> OPT_BOOLEAN(0, "children", &symbol_conf.cumulate_callchain,
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index af849b1d7389..7e2e72e6d9d1 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -1093,7 +1093,8 @@ parse_percent_limit(const struct option *opt, const char *arg,
> return 0;
> }
>
> -const char top_callchain_help[] = CALLCHAIN_RECORD_HELP ", " CALLCHAIN_REPORT_HELP;
> +const char top_callchain_help[] = CALLCHAIN_RECORD_HELP CALLCHAIN_REPORT_HELP
> + "\n\t\t\t\tDefault: fp,graph,0.5,caller,function";
>
> int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
> {
> @@ -1173,7 +1174,7 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
> NULL, "enables call-graph recording and display",
> &callchain_opt),
> OPT_CALLBACK(0, "call-graph", &top.record_opts,
> - "mode[,dump_size],output_type,min_percent[,print_limit],call_order[,branch]",
> + "record_mode[,record_size],print_type,threshold[,print_limit],order,sort_key[,branch]",
> top_callchain_help, &parse_callchain_opt),
> OPT_BOOLEAN(0, "children", &symbol_conf.cumulate_callchain,
> "Accumulate callchains of children and show total overhead as well"),
> diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
> index aaf467c9ef2b..fce8161e54db 100644
> --- a/tools/perf/util/callchain.h
> +++ b/tools/perf/util/callchain.h
> @@ -7,17 +7,29 @@
> #include "event.h"
> #include "symbol.h"
>
> -#define CALLCHAIN_HELP "setup and enables call-graph (stack chain/backtrace) recording: "
> +#define HELP_PAD "\t\t\t\t"
> +
> +#define CALLCHAIN_HELP "setup and enables call-graph (stack chain/backtrace):\n\n"
>
> #ifdef HAVE_DWARF_UNWIND_SUPPORT
> -#define CALLCHAIN_RECORD_HELP CALLCHAIN_HELP "fp dwarf lbr"
> +# define RECORD_MODE_HELP HELP_PAD "record_mode:\tcall graph recording mode (fp|dwarf|lbr)\n"
> #else
> -#define CALLCHAIN_RECORD_HELP CALLCHAIN_HELP "fp lbr"
> +# define RECORD_MODE_HELP HELP_PAD "record_mode:\tcall graph recording mode (fp|lbr)\n"
> #endif
>
> -#define CALLCHAIN_REPORT_HELP "output_type (graph, flat, fractal, or none), " \
> - "min percent threshold, optional print limit, callchain order, " \
> - "key (function or address), add branches"
> +#define RECORD_SIZE_HELP \
> + HELP_PAD "record_size:\tif record_mode is 'dwarf', max size of stack recording (<bytes>)\n" \
> + HELP_PAD "\t\tdefault: 8192 (bytes)\n"
> +
> +#define CALLCHAIN_RECORD_HELP CALLCHAIN_HELP RECORD_MODE_HELP RECORD_SIZE_HELP
> +
> +#define CALLCHAIN_REPORT_HELP \
> + HELP_PAD "print_type:\tcall graph printing style (graph|flat|fractal|none)\n" \
> + HELP_PAD "threshold:\tminimum call graph inclusion threshold (<percent>)\n" \
> + HELP_PAD "print_limit:\tmaximum number of call graph entry (<number>)\n" \
> + HELP_PAD "order:\t\tcall graph order (caller|callee)\n" \
> + HELP_PAD "sort_key:\tcall graph sort key (function|address)\n" \
> + HELP_PAD "branch:\t\tinclude last branch info to call graph (branch)\n"
>
> enum perf_call_graph_mode {
> CALLCHAIN_NONE,
> --
> 2.6.0