RE: [PATCH V4 1/2] perf,tools: add time out to force stop proc map processing

From: Liang, Kan
Date: Mon Jun 22 2015 - 14:02:12 EST



> On Wed, Jun 17, 2015 at 09:51:10AM -0400, kan.liang@xxxxxxxxx wrote:
> > From: Kan Liang <kan.liang@xxxxxxxxx>
> > System-wide sampling such as 'perf top' or 'perf record -a' reads every
> > thread's /proc/xxx/maps before sampling. If any thread keeps generating
> > a huge, ever-growing maps file, perf loops forever while synthesizing
> > and nothing is sampled.
> >
> > This patch fixes the issue by adding a per-thread timeout that forces
> > this kind of endless proc map processing to stop.
> > PERF_RECORD_MISC_PROC_MAP_PARSE_TIME_OUT is introduced to indicate
> > that the mmap records were truncated by the timeout. The user gets a
> > warning when truncated mmap records are detected.
>
> I am applying the patch, we indeed need to keep it in the last
> PERF_RECORD_MMAP2 we synthesized for a proc mmap file, i.e. it has to
> be stored in the perf.data file, so that later, at 'report' time, possibly in
> another machine, that situation can be reported to the user.
>
> But the warning is only shown for tools that process events via
> perf_session__process_events(), like 'perf report'.
>
> 'top' and 'trace', for instance, don't do that, so the users will not be warned
> in that case.

Users can be warned right away with pr_warning() when the timeout is detected.
I tested both 'top' and 'trace'.

# perf trace
Reading /proc/18099/maps time out. You may want to increase the time limit
by --proc-map-timeout
Reading /proc/18100/maps time out. You may want to increase the time limit
by --proc-map-timeout
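
For illustration only, the "warn as soon as the timeout is detected" behaviour can be sketched outside of perf as below. This is not the code from the patch: read_maps_with_timeout() and now_ns() are hypothetical names, fprintf(stderr, ...) stands in for pr_warning(), and the millisecond unit of the timeout is an assumption of this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: monotonic clock in nanoseconds. */
static unsigned long long now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (unsigned long long)ts.tv_sec * 1000000000ULL +
	       (unsigned long long)ts.tv_nsec;
}

/*
 * Hypothetical sketch: read 'path' with fgets() and give up once
 * 'timeout_ms' milliseconds have passed.
 *
 * Returns the number of successful fgets() reads, or -1 if the file
 * cannot be opened. '*truncated' is set to true when the deadline was
 * hit, which is the condition the real patch records in the event's
 * misc field.
 */
static long read_maps_with_timeout(const char *path, unsigned int timeout_ms,
				   bool *truncated)
{
	char line[4096];
	long nr_reads = 0;
	unsigned long long deadline;
	FILE *fp;

	*truncated = false;

	fp = fopen(path, "r");
	if (fp == NULL)
		return -1;

	deadline = now_ns() + (unsigned long long)timeout_ms * 1000000ULL;

	while (fgets(line, sizeof(line), fp) != NULL) {
		nr_reads++;

		/* The deadline is checked after each read, so at least
		 * one line is always consumed. */
		if (now_ns() >= deadline) {
			/* Stand-in for pr_warning(): warn immediately. */
			fprintf(stderr,
				"Reading %s time out. You may want to increase "
				"the time limit by --proc-map-timeout\n", path);
			*truncated = true;
			break;
		}
	}

	fclose(fp);
	return nr_reads;
}
```

A caller would pass something like "/proc/<pid>/maps" as the path and check the truncated flag afterwards. Because the warning is emitted at detection time, it does not depend on how the events are processed later.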

For 'top', the error message is printed at the bottom of the screen.
In the default --tui mode the user may miss it, but only if very few threads
time out. (In my test case 64 threads timed out and the error messages stayed
on screen for 32 seconds, so the user could hardly miss them.)

So the problem only affects some cases of 'top'.
>
> To fix this we need to make machine__process_mmap2_event() notice
> that PERF_RECORD_MISC_PROC_MAP_PARSE_TIME_OUT is set, and
> bump a per-struct machine counter, something like
> machine->stats.nr_truncated_mmaps.

machine__process_mmap2_event() is also called by 'record', 'report', etc.
Since this is only an issue for 'top', how about updating
nr_truncated_mmaps in machine__process_event() instead?

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 5cfc3aa..95a1dc5 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -952,6 +952,16 @@ static int __cmd_top(struct perf_top *top)

 	machine__synthesize_threads(&top->session->machines.host, &opts->target,
 				    top->evlist->threads, false, opts->proc_map_timeout);
+
+	if (top->session->machines.host.nr_truncated_mmaps)
+		ui__warning("%d map information files for pre-existing threads were\n"
+			    "not processed, if there are samples for addresses they\n"
+			    "will not be resolved, you may find out which are these\n"
+			    "threads by running with --stdio\n"
+			    "The time limit to process proc map is too short?\n"
+			    "Increase it by --proc-map-timeout\n",
+			    top->session->machines.host.nr_truncated_mmaps);
+
 	ret = perf_top__start_counters(top);
 	if (ret)
 		goto out_delete;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 968cea8..7d960ff 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1438,6 +1438,8 @@ int machine__process_event(struct machine *machine, union perf_event *event,
 	case PERF_RECORD_MMAP:
 		ret = machine__process_mmap_event(machine, event, sample); break;
 	case PERF_RECORD_MMAP2:
+		if (event->header.misc & PERF_RECORD_MISC_PROC_MAP_PARSE_TIMEOUT)
+			machine->nr_truncated_mmaps++;
 		ret = machine__process_mmap2_event(machine, event, sample); break;
 	case PERF_RECORD_FORK:
 		ret = machine__process_fork_event(machine, event, sample); break;
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 4333a3a..af74955 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -38,6 +38,7 @@ struct machine {
 	struct map_groups kmaps;
 	struct map *vmlinux_maps[MAP__NR_TYPES];
 	u64 kernel_start;
+	u32 nr_truncated_mmaps;
 	symbol_filter_t symbol_filter;
 	pid_t *current_tid;
 	union { /* Tool specific area */
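
To show the counting idea in isolation, here is a minimal standalone sketch of a dispatcher that bumps a per-machine counter when an MMAP2-style event carries a "truncated" flag. Every name in it (the sketch_* types, constants and function) is a hypothetical stand-in rather than the perf definitions, and the numeric values are arbitrary.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the perf record types (values arbitrary). */
enum sketch_record_type {
	SKETCH_RECORD_MMAP  = 1,
	SKETCH_RECORD_FORK  = 2,
	SKETCH_RECORD_MMAP2 = 3,
};

/* Hypothetical flag bit meaning "proc map parsing timed out". */
#define SKETCH_MISC_PROC_MAP_PARSE_TIMEOUT (1u << 12)

struct sketch_event_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

struct sketch_machine {
	uint32_t nr_truncated_mmaps;	/* events flagged as truncated */
	uint32_t nr_mmap2;		/* all MMAP2-style events seen */
};

/*
 * Dispatch one event. The truncation check sits in the dispatcher,
 * before the per-type handling, mirroring the placement proposed in
 * the diff above. Returns 0 for known types and -1 otherwise.
 */
static int sketch_machine_process_event(struct sketch_machine *machine,
					const struct sketch_event_header *ev)
{
	switch (ev->type) {
	case SKETCH_RECORD_MMAP2:
		if (ev->misc & SKETCH_MISC_PROC_MAP_PARSE_TIMEOUT)
			machine->nr_truncated_mmaps++;
		machine->nr_mmap2++;	/* stand-in for the real handler */
		return 0;
	case SKETCH_RECORD_MMAP:
	case SKETCH_RECORD_FORK:
		return 0;
	default:
		return -1;
	}
}
```

After feeding all synthesized events through such a dispatcher, a tool like 'top' only needs to test whether nr_truncated_mmaps is non-zero to decide whether to show one summary warning.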

Thanks,
Kan
>
> The callchain is:
>
> builtin-top.c
>   machine__synthesize_threads()
>     __machine__synthesize_threads()
>       perf_event__process()
>         machine__process_event()
>           machine__process_mmap2_event()
>
> At some point it may even make sense to add an event_stats to struct
> machine :-)
>
> - Arnaldo