[tip: perf/core] perf tools: Allow to build with -ltcmalloc

From: tip-bot2 for Jiri Olsa
Date: Mon Oct 21 2019 - 20:03:44 EST


The following commit has been merged into the perf/core branch of tip:

Commit-ID: bb91a073ed124d3f6224d8ac37ecb536c01970c1
Gitweb: https://git.kernel.org/tip/bb91a073ed124d3f6224d8ac37ecb536c01970c1
Author: Jiri Olsa <jolsa@xxxxxxxxxx>
AuthorDate: Sun, 13 Oct 2019 17:14:25 +02:00
Committer: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
CommitterDate: Tue, 15 Oct 2019 08:36:22 -03:00

perf tools: Allow to build with -ltcmalloc

By using "make TCMALLOC=1" you can enable perf to be build for usage
with libtcmalloc.so (gperftools).

Get heap profile (tools/perf directory):

$ <install gperftools>
$ make TCMALLOC=1 DEBUG=1
$ HEAPPROFILE=/tmp/heapprof ./perf ...
$ pprof ./perf /tmp/heapprof.000*
(pprof) top
Total: 2335.5 MB
1735.1 74.3% 74.3% 1735.1 74.3% memdup
402.0 17.2% 91.5% 402.0 17.2% zalloc
140.2 6.0% 97.5% 145.8 6.2% map__new
33.6 1.4% 98.9% 33.6 1.4% symbol__new
12.4 0.5% 99.5% 12.4 0.5% alloc_event
6.2 0.3% 99.7% 6.2 0.3% nsinfo__new
5.5 0.2% 100.0% 5.5 0.2% nsinfo__copy
0.3 0.0% 100.0% 0.3 0.0% dso__new
0.1 0.0% 100.0% 0.1 0.0% do_read_string
0.0 0.0% 100.0% 0.0 0.0% __GI__IO_file_doallocate

See callstack:
$ pprof --pdf ./perf /tmp/heapprof.00* > callstack.pdf
$ pprof --web ./perf /tmp/heapprof.00*

Committer testing:

Install gperftools, on fedora:

# dnf install gperftools-devel

Then build:

$ make TCMALLOC=1 DEBUG=1 -C tools/perf O=/tmp/build/perf install-bin

Verify that it linked against the right library:

$ ldd ~/bin/perf | grep tcma
libtcmalloc.so.4 => /lib64/libtcmalloc.so.4 (0x00007fb2953a7000)
$

Run 'perf trace' system wide for 1 minute:

# HEAPPROFILE=/tmp/heapprof perf trace -a sleep 1m
<SNIP>
59985.524 ( 0.006 ms): Web Content/20354 recvmsg(fd: 9<socket:[1762817]>, msg: 0x7ffee5fdafb0) = -1 EAGAIN (Resource temporarily unavailable)
59985.536 ( 0.005 ms): Web Content/20354 recvmsg(fd: 9<socket:[1762817]>, msg: 0x7ffee5fdafc0) = -1 EAGAIN (Resource temporarily unavailable)
59981.956 (10.143 ms): SCTP timer/21716 ... [continued]: select()) = 0 (Timeout)
59985.549 ( ): Web Content/20354 poll(ufds: 0x7f1df38af180, nfds: 3, timeout_msecs: 4294967295) ...
0.926 (59999.481 ms): sleep/29764 ... [continued]: nanosleep()) = 0
59992.133 ( ): SCTP timer/21716 select(tvp: 0x7ff5bf7fee80) ...
60000.477 ( 0.009 ms): sleep/29764 close(fd: 1) = 0
60000.493 ( 0.005 ms): sleep/29764 close(fd: 2) = 0
60000.514 ( ): sleep/29764 exit_group() = ?
Dumping heap profile to /tmp/heapprof.0001.heap (Exiting, 3 MB in use)
[root@quaco ~]#

Install pprof:

# dnf install pprof

And run it:

# pprof ~/bin/perf /tmp/heapprof.0001.heap
Using local file /root/bin/perf.
Using local file /tmp/heapprof.0001.heap.
Welcome to pprof! For help, type 'help'.
(pprof) top
Total: 4.0 MB
1.7 42.0% 42.0% 2.2 54.1% map__new
0.9 23.3% 65.3% 0.9 23.3% zalloc
0.5 11.4% 76.7% 0.5 11.4% dso__new
0.2 5.6% 82.3% 0.3 8.5% trace__sys_enter
0.2 4.9% 87.2% 0.2 4.9% __GI___strdup
0.2 3.8% 91.0% 0.2 3.8% new_term
0.1 2.2% 93.2% 0.4 10.1% __perf_pmu__new_alias
0.0 1.0% 94.3% 0.0 1.2% event_read_fields
0.0 0.8% 95.1% 0.0 0.8% nsinfo__new
0.0 0.7% 95.8% 0.1 3.2% trace__read_syscall_info
(pprof)

Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
Tested-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
Cc: Andi Kleen <ak@xxxxxxxxxxxxxxx>
Cc: Michael Petlan <mpetlan@xxxxxxxxxx>
Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Link: http://lore.kernel.org/lkml/20191013151427.11941-2-jolsa@xxxxxxxxxx
Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
---
tools/perf/Makefile.config | 5 +++++
tools/perf/Makefile.perf | 2 ++
2 files changed, 7 insertions(+)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 063202c..1783427 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -265,6 +265,11 @@ LDFLAGS += -Wl,-z,noexecstack

EXTLIBS = -lpthread -lrt -lm -ldl

+ifneq ($(TCMALLOC),)
+ CFLAGS += -fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free
+ EXTLIBS += -ltcmalloc
+endif
+
ifeq ($(FEATURES_DUMP),)
include $(srctree)/tools/build/Makefile.feature
else
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index a099a8a..8f1ba98 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -114,6 +114,8 @@ include ../scripts/utilities.mak
# Define NO_LIBZSTD if you do not want support of Zstandard based runtime
# trace compression in record mode.
#
+# Define TCMALLOC to enable tcmalloc heap profiling.
+#

# As per kernel Makefile, avoid funny character set dependencies
unexport LC_ALL