Re: perf_evlist__filter_pollfd() in trace__run()

From: Arnaldo Carvalho de Melo
Date: Fri Jan 30 2015 - 10:22:18 EST


Em Thu, Jan 29, 2015 at 03:55:22PM -0800, Sukadev Bhattiprolu escreveu:
> Arnaldo,
>
> On one of our systems we are seeing an intermittent SIGSEGV with
>
> perf trace sleep 1
>
> and I have question about the 'draining' flag below:
> |
> | From 46fb3c21d20415dd2693570c33d0ea6eb8745e04 Mon Sep 17 00:00:00 2001
> | From: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> | Date: Mon, 22 Sep 2014 14:39:48 -0300
> | Subject: [PATCH 1/1] perf trace: Filter out POLLHUP'ed file descriptors
> |
> | So that we don't continue polling on vanished file descriptors, i.e.
> | file descriptors for events monitoring threads that exited.
> |
> | I.e. the following 'trace' command now exits as expected, instead
> | of staying in an eternal loop:
> |
> | $ sleep 5s &
> | $ trace -p `pidof sleep`
> |
> | Reported-by: Jiri Olsa <jolsa@xxxxxxxxxx>
> | Cc: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> | Cc: David Ahern <dsahern@xxxxxxxxx>
> | Cc: Don Zickus <dzickus@xxxxxxxxxx>
> | Cc: Frederic Weisbecker <fweisbec@xxxxxxxxx>
> | Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
> | Cc: Mike Galbraith <efault@xxxxxx>
> | Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
> | Cc: Paul Mackerras <paulus@xxxxxxxxx>
> | Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> | Cc: Stephane Eranian <eranian@xxxxxxxxxx>
> | Link: http://lkml.kernel.org/n/tip-6qegv786zbf6i8us6t4rxug9@xxxxxxxxxxxxxx
> | Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> | ---
> | tools/perf/builtin-trace.c | 7 ++++++-
> | 1 file changed, 6 insertions(+), 1 deletion(-)
> |
> | diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> | index b8fedf3..fe39dc6 100644
> | --- a/tools/perf/builtin-trace.c
> | +++ b/tools/perf/builtin-trace.c
> | @@ -2044,6 +2044,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
> | int err = -1, i;
> | unsigned long before;
> | const bool forks = argc > 0;
> | + bool draining = false;
> | char sbuf[STRERR_BUFSIZE];
> |
> | trace->live = true;
> | @@ -2171,8 +2172,12 @@ next_event:
> | if (trace->nr_events == before) {
> | int timeout = done ? 100 : -1;
> |
> | - if (perf_evlist__poll(evlist, timeout) > 0)
> | + if (!draining && perf_evlist__poll(evlist, timeout) > 0) {
> | + if (perf_evlist__filter_pollfd(evlist, POLLERR | POLLHUP) == 0)
> | + draining = true;
> | +
>
> If an fd gets into POLLHUP state, perf_evlist__filter_pollfd() removes
> ("puts") the mmap for the fd. We are seeing that sometimes (frequently)
> _all_ fds are in the POLLHUP state and hence their mmap->base are set
> to NULL.
>
>
> | goto again;
>
> Now with this goto, we go back and call perf_evlist__mmap_read() which
> tries to access the freed mmaps.
>
> Should there be another check to before reading the mmap again ?

Possibly, checking, but a similar algorithm should be in place for
'record', do you see any problems there? I.e. with 'perf record sleep
1'?

- Arnaldo

> I must add that I don't get the SIGSEGV on recent perf-core and the
> system where we get the crash, first runs into the following
> errors that we are still looking into (maybe related to "ppc64le"
> architecture).
>
> Problems reading syscall 45 information
> Problems reading syscall 5 information
> Problems reading syscall 5 information
> Problems reading syscall 108 information
> Problems reading syscall 108 information
> Problems reading syscall 90 information
> Problems reading syscall 90 information
> Problems reading syscall 6 information
>
> Unlike the SIGSEGV, these errors occur always.
>
> | + }
> | } else {
> | goto again;
> | }
> | --
> | 1.8.3.1
> |
>
> Following hack seems to fix the SIGSEGV, but then we completely ignore
> 'draining' flag.
>
>
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index fb12645..ac25e16 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -2173,8 +2173,10 @@ next_event:
> int timeout = done ? 100 : -1;
>
> if (!draining && perf_evlist__poll(evlist, timeout) > 0) {
> - if (perf_evlist__filter_pollfd(evlist, POLLERR | POLLHUP) == 0)
> + if (perf_evlist__filter_pollfd(evlist, POLLERR | POLLHUP) == 0) {
> draining = true;
> + goto out_disable;
> + }
>
> goto again;
> }
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/