RE: [PATCH 1/1] perf,tools: add time out to force stop endless mmap processing

From: Liang, Kan
Date: Fri Jun 12 2015 - 13:05:12 EST


>
> On 6/12/15 8:42 AM, Liang, Kan wrote:
> >
> >>
> >> On 6/11/15 12:47 PM, Andi Kleen wrote:
> >>>> Can you elaborate on an example? I don't see how this can happen
> >>>> reading a maps file. And it does not read maps for all threads only
> >>>> thread group leaders.
> >>>
> >>> This is with a stress test case that generates lots of small
> >>> mappings at very high speed and frees them again. So the maps file
> >>> keeps changing faster than the proc reader can keep it and it can
> >>> end up with a live lock.
> >>
> >> Can you pass it along? I'd like to see how the task_diag proposal
> >> handles it.
> >>
> >> https://github.com/dsahern/linux/commits/task_diag-wip
> >
> > Hi David,
> >
> > I tried the task_diag on my platform, but it shows error message when
> > I run perf top. " Message handling failed: rc -1, errno 25".
> > And it looks perf top failed to get maps information.
>
> Not surprising; it's only half-baked. Can you try perf-record? So far that is
> the only one I have tested.
>

perf record does not reproduce the infinite loop that was found in perf top.
But we can observe that synthesizing threads takes a very long time in perf record.

According to the test results below, current perf takes 13s to read the maps,
while task_diag takes 14s to synthesize the threads.
(Note: The time increases the longer the test runs.)

So it looks like task_diag doesn't help with this issue.

[perf]$ sudo ./perf record -e instructions:pp --pid 14560
Reading /proc/14560/maps cost 13.12690599 s
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.108 MB perf.data (2783 samples) ]

[perf]$ sudo ./perf_task_diag record -e instructions:pp --pid 14560
synthesized threads took 14.435450 sec
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.035 MB perf.data (885 samples) ]
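
For reference, the idea behind the timeout can be sketched as below. This is
only a minimal illustration and not the code from the patch: the helper name
read_maps_with_timeout() and its parameters are made up for this example. It
reads a maps file line by line and gives up once a time budget is exceeded,
so a file that grows faster than it can be read cannot stall the reader
forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper, not taken from the perf sources.
 *
 * Reads 'path' (e.g. /proc/<pid>/maps) with fgets() and stops once more
 * than 'budget_sec' seconds have elapsed since the read loop started.
 *
 * Returns the number of successful fgets() reads, or -1 if the file
 * cannot be opened. *timed_out is set to 1 if the budget expired before
 * end of file, 0 otherwise.
 */
static long read_maps_with_timeout(const char *path, double budget_sec,
				   int *timed_out)
{
	char line[4096];
	struct timespec start, now;
	long nr_reads = 0;
	FILE *fp;

	*timed_out = 0;

	fp = fopen(path, "r");
	if (!fp)
		return -1;

	clock_gettime(CLOCK_MONOTONIC, &start);

	while (fgets(line, sizeof(line), fp)) {
		double elapsed;

		nr_reads++;

		/* Check the budget after every read. */
		clock_gettime(CLOCK_MONOTONIC, &now);
		elapsed = (double)(now.tv_sec - start.tv_sec) +
			  (double)(now.tv_nsec - start.tv_nsec) / 1e9;
		if (elapsed > budget_sec) {
			*timed_out = 1;
			break;
		}
	}

	fclose(fp);
	return nr_reads;
}
```

A caller would pass something like "/proc/14560/maps" and a budget in
seconds, then check timed_out to decide whether the maps it collected are
incomplete. Checking the clock after every line is simple but adds a
clock_gettime() call per line; a real implementation may want to check less
often.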


> Also, while running that kernel you can build the test programs under
> tools/testing/selftests/task_diag/ and try task_diag_all. I am away from my
> dev box at the moment. As I recall you will want to try 'task_diag_all o $pid'
> or 'task_diag_all a'
>
Neither option works on my platform.

[task_diag]$ sudo ./task_diag_all a
Unable to receive message: Operation not supported
[task_diag]$ sudo ./task_diag_all o 14751
Unable to receive message: Operation not supported

> I take this to mean you don't want to share the test program? I am curious
> as to how other tools handle this use case.

It's our internal test case. I'm afraid I cannot share it.
But if you want more tests run, I'd be glad to run them.

Thanks,
Kan
