Re: [PATCH] perf stat: Poll for monitored tasks being alive in fork mode

From: Jiri Olsa
Date: Sun Jan 06 2019 - 08:25:20 EST


On Sat, Jan 05, 2019 at 11:16:40AM +0800, Jin, Yao wrote:
>
>
> On 1/4/2019 8:54 PM, Jiri Olsa wrote:
> > On Fri, Jan 04, 2019 at 10:28:17AM +0800, Jin Yao wrote:
> > > Following test shows the stat keeps running even if no longer
> > > task to monitor (mgen exits at ~5s).
> > >
> > > perf stat -e cycles -p `pgrep mgen` -I1000 -- sleep 10
> > > time counts unit events
> > > 1.000148916 1,308,365,864 cycles
> > > 2.000379171 1,297,269,875 cycles
> > > 3.000556719 1,297,187,078 cycles
> > > 4.000914241 761,261,827 cycles
> > > 5.001306091 <not counted> cycles
> > > 6.001676881 <not counted> cycles
> > > 7.002046336 <not counted> cycles
> > > 8.002405651 <not counted> cycles
> > > 9.002766625 <not counted> cycles
> > > 10.001395827 <not counted> cycles
> > >
> > > We'd better finish stat immediately if there's no longer task to
> > > monitor.
> > >
> > > After:
> > >
> > > perf stat -e cycles -p `pgrep mgen` -I1000 -- sleep 10
> > > time counts unit events
> > > 1.000180062 1,236,592,661 cycles
> > > 2.000421539 1,223,733,572 cycles
> > > 3.000609910 1,297,047,663 cycles
> > > 4.000807545 1,297,215,816 cycles
> > > 5.001001578 1,297,208,032 cycles
> > > 6.001390345 582,343,659 cycles
> > > sleep: Terminated
> > >
> > > Now the stat exits immediately when the monitored tasks ends.
> > >
> > > Signed-off-by: Jin Yao <yao.jin@xxxxxxxxxxxxxxx>
> > > ---
> > > tools/perf/builtin-stat.c | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> > > index 63a3afc..71f3bc8 100644
> > > --- a/tools/perf/builtin-stat.c
> > > +++ b/tools/perf/builtin-stat.c
> > > @@ -553,6 +553,13 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
> > > if (interval || timeout) {
> > > while (!waitpid(child_pid, &status, WNOHANG)) {
> > > + if (!is_target_alive(&target,
> > > + evsel_list->threads) &&
> > > + (child_pid != -1)) {
> >
> > do we need that child_pid check? we just returned from waitpid
> > so we should be ok.. we just make the race window smaller
> >
> > could we just do:
> >
> > if (!is_target_alive(&target, evsel_list->threads)) {
> > kill(child_pid, SIGTERM);
> > break;
> > }
> >
>
> I think this code should be OK and I have tested yet. I have a question
> about the race condition, we really don't need a lock to protect the
> child_pid?
>
> skip_signal()
> {
> /*
> * render child_pid harmless
> * won't send SIGTERM to a random
> * process in case of race condition
> * and fast PID recycling
> */
> child_pid = -1;
> }
>
> __run_perf_stat()
> {
> ....
> kill(child_pid, SIGTERM);
> }
>
> If child_pid is set by -1 in a small window between checking of child_pid
> and kill(), then kill(-1, SIGTERM) may happen. All processes except the kill
> process itself and init would receive SIGTERM.

ah right, -1 is special.. however that can still happen also
in the orginal patch.. how about we do something like below

jirka


---
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index acfd48db52dd..c322cb271180 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -583,6 +583,14 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)

if (interval || timeout) {
while (!waitpid(child_pid, &status, WNOHANG)) {
+ if (!is_target_alive(&target, evsel_list->threads)) {
+ int pid = child_pid;
+
+ if (pid != -1)
+ kill(pid, SIGTERM);
+ break;
+ }
+
nanosleep(&ts, NULL);
if (timeout)
break;