Re: [PATCH v7 2/6] perf record: Get the first sample time and last sample time

From: Arnaldo Carvalho de Melo
Date: Thu Jan 04 2018 - 14:10:03 EST


Em Fri, Dec 08, 2017 at 09:13:42PM +0800, Jin Yao escreveu:
> In the default 'perf record' configuration, all samples are processed,
> to create the HEADER_BUILD_ID table. So it's very easy to get the
> first/last samples and save the time to perf file header via the
> function write_sample_time().
>
> Later, at post processing time, perf report/script will fetch
> the time from perf file header.

So, at this point I was expecting that that record would be present on
the perf.data file:

[acme@jouet perf]$ perf record --timestamp-boundary sleep 1
Cannot read kernel map
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (7 samples) ]
[acme@jouet perf]$ perf report -D | grep PERF_RECORD_SAMPLE | wc -l
7
[acme@jouet perf]$ perf report -D | grep PERF_RECORD_SAMPLE_TIME
[acme@jouet perf]$

What am I doing wrong?

To clarify, this is with just the first two patches in this series
applied.

- Arnaldo

> Change log:
> -----------
> v7: Just update the patch description according to Arnaldo's suggestion.
>
> v6: Currently '--buildid-all' is not enabled at default. So the walking
> on all samples is the default operation. There is no big overhead
> to calculate the timestamp boundary in process_sample_event handler
> once we already go through all samples. So the timestamp boundary
> calculation is enabled by default when '--buildid-all' is not enabled.
>
> While if '--buildid-all' is enabled, we creates a new option
> "--timestamp-boundary" for user to decide if it enables the
> timestamp boundary calculation.
>
> v5: There is an issue that the sample walking can only work when
> '--buildid-all' is not enabled. So we need to let the walking
> be able to work even if '--buildid-all' is enabled and let the
> processing skips the dso hit marking for this case.
>
> At first, I want to provide a new option "--record-time-boundaries".
> While after consideration, I think a new option is not very
> necessary.
>
> v3: Remove the definitions of first_sample_time and last_sample_time
> from struct record and directly save them in perf_evlist.
>
> Signed-off-by: Jin Yao <yao.jin@xxxxxxxxxxxxxxx>
> ---
> tools/perf/Documentation/perf-record.txt | 3 +++
> tools/perf/builtin-record.c | 18 +++++++++++++++---
> 2 files changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
> index 5a626ef..3eea6de 100644
> --- a/tools/perf/Documentation/perf-record.txt
> +++ b/tools/perf/Documentation/perf-record.txt
> @@ -430,6 +430,9 @@ Configure all used events to run in user space.
> --timestamp-filename
> Append timestamp to output file name.
>
> +--timestamp-boundary::
> +Record timestamp boundary (time of first/last samples).
> +
> --switch-output[=mode]::
> Generate multiple perf.data files, timestamp prefixed, switching to a new one
> based on 'mode' value:
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 98da8cb..6db7b5a 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -78,6 +78,7 @@ struct record {
> bool no_buildid_cache_set;
> bool buildid_all;
> bool timestamp_filename;
> + bool timestamp_boundary;
> struct switch_output switch_output;
> unsigned long long samples;
> };
> @@ -409,8 +410,15 @@ static int process_sample_event(struct perf_tool *tool,
> {
> struct record *rec = container_of(tool, struct record, tool);
>
> - rec->samples++;
> + if (rec->evlist->first_sample_time == 0)
> + rec->evlist->first_sample_time = sample->time;
> +
> + rec->evlist->last_sample_time = sample->time;
>
> + if (rec->buildid_all)
> + return 0;
> +
> + rec->samples++;
> return build_id__mark_dso_hit(tool, event, sample, evsel, machine);
> }
>
> @@ -435,9 +443,11 @@ static int process_buildids(struct record *rec)
>
> /*
> * If --buildid-all is given, it marks all DSO regardless of hits,
> - * so no need to process samples.
> + * so no need to process samples. But if timestamp_boundary is enabled,
> + * it still needs to walk on all samples to get the timestamps of
> + * first/last samples.
> */
> - if (rec->buildid_all)
> + if (rec->buildid_all && !rec->timestamp_boundary)
> rec->tool.sample = NULL;
>
> return perf_session__process_events(session);
> @@ -1621,6 +1631,8 @@ static struct option __record_options[] = {
> "Record build-id of all DSOs regardless of hits"),
> OPT_BOOLEAN(0, "timestamp-filename", &record.timestamp_filename,
> "append timestamp to output filename"),
> + OPT_BOOLEAN(0, "timestamp-boundary", &record.timestamp_boundary,
> + "Record timestamp boundary (time of first/last samples)"),
> OPT_STRING_OPTARG_SET(0, "switch-output", &record.switch_output.str,
> &record.switch_output.set, "signal,size,time",
> "Switch output when receive SIGUSR2 or cross size,time threshold",
> --
> 2.7.4