Re: [PATCH v1] perf record: collect user registers set jointly with dwarf stacks

From: Alexey Budankov
Date: Thu Apr 18 2019 - 05:01:31 EST


On 17.04.2019 21:47, Arnaldo Carvalho de Melo wrote:
> Em Wed, Apr 17, 2019 at 08:45:08PM +0300, Alexey Budankov escreveu:
>> Hi Arnaldo,
>>
>> On 17.04.2019 18:48, Arnaldo Carvalho de Melo wrote:
>>> On April 17, 2019 11:40:02 AM GMT-03:00, Jiri Olsa <jolsa@xxxxxxxxxx> wrote:
>>>> On Wed, Apr 17, 2019 at 11:35:42AM -0300, Arnaldo Carvalho de Melo wrote:
>>>>> Em Wed, Apr 17, 2019 at 09:39:52AM +0200, Jiri Olsa escreveu:
>>>>>> On Mon, Apr 15, 2019 at 06:36:13PM +0300, Alexey Budankov wrote:
>>>>>>>
>>>>>>> When dwarf stacks are collected jointly with a user-specified register
>>>>>>> set using the --user-regs option, as below, the full register context
>>>>>>> is still captured on a sample:
>>>>>>>
>>>>>>> $ perf record -g --call-graph dwarf,1024 --user-regs=IP,SP,BP -- matrix.gcc.g.O3
>>>>>>>
>>>>>>> 188143843893585 0x6b48 [0x4f8]: PERF_RECORD_SAMPLE(IP, 0x4002): 23828/23828: 0x401236 period: 1363819 addr: 0x7ffedbdd51ac
>>>>>>> ... FP chain: nr:0
>>>>>>> ... user regs: mask 0xff0fff ABI 64-bit
>>>>>>> .... AX 0x53b
>>>>>>> .... BX 0x7ffedbdd3cc0
>>>>>>> .... CX 0xffffffff
>>>>>>> .... DX 0x33d3a
>>>>>>> .... SI 0x7f09b74c38d0
>>>>>>> .... DI 0x0
>>>>>>> .... BP 0x401260
>>>>>>> .... SP 0x7ffedbdd3cc0
>>>>>>> .... IP 0x401236
>>>>>>> .... FLAGS 0x20a
>>>>>>> .... CS 0x33
>>>>>>> .... SS 0x2b
>>>>>>> .... R8 0x7f09b74c3800
>>>>>>> .... R9 0x7f09b74c2da0
>>>>>>> .... R10 0xfffffffffffff3ce
>>>>>>> .... R11 0x246
>>>>>>> .... R12 0x401070
>>>>>>> .... R13 0x7ffedbdd5db0
>>>>>>> .... R14 0x0
>>>>>>> .... R15 0x0
>>>>>>> ... ustack: size 1024, offset 0xe0
>>>>>>> . data_src: 0x5080021
>>>>>>> ... thread: stack_test2.g.O:23828
>>>>>>> ...... dso: /root/abudanko/stacks/stack_test2.g.O3
>>>>>>>
>>>>>>> After applying the change suggested in the patch, the sample data
>>>>>>> contain only the user-specified register values:
>>>>>>>
>>>>>>> $ perf record -g --call-graph dwarf,1024 --user-regs=IP,SP,BP -- matrix.gcc.g.O3
>>>>>>>
>>>>>>> 188368474305373 0x5e40 [0x470]: PERF_RECORD_SAMPLE(IP, 0x4002): 23839/23839: 0x401236 period: 1260507 addr: 0x7ffd3d85e96c
>>>>>>> ... FP chain: nr:0
>>>>>>> ... user regs: mask 0x1c0 ABI 64-bit
>>>>>>> .... BP 0x401260
>>>>>>> .... SP 0x7ffd3d85cc20
>>>>>>> .... IP 0x401236
>>>>>>> ... ustack: size 1024, offset 0x58
>>>>>>> . data_src: 0x5080021
>>>>>>> ... thread: stack_test2.g.O:23839
>>>>>>> ...... dso: /root/abudanko/stacks/stack_test2.g.O3
>>>>>>>
>>>>>>> Signed-off-by: Alexey Budankov <alexey.budankov@xxxxxxxxxxxxxxx>
>>>>>>
>>>>>> Acked-by: Jiri Olsa <jolsa@xxxxxxxxxx>
>>>>>
>>>>> So, there are registers that are needed to do the DWARF unwinding,
>>>>> right? But at the same time, if the user says only some are needed,
>>>>> they had better know what they're doing and ask for at least the
>>>>> registers needed for the unwinding process to be successful, right?
>>>>
>>>> yep, that's how I understand it
>>>
>>> So we need to document that, stating that specifying a set of registers together with requesting DWARF callchains may break things.
>>
>> Do you mean break callchains if omitting IP,SP,BP?
>> For example like this:
>> $ perf record -g --call-graph dwarf,1024 --user-regs=AX,BX,CX -- matrix.gcc.g.O3
>
> Right, i.e. if you don't use --user-regs, then a set of registers will
> be asked for by --call-graph dwarf, right? If you use both and specify a

Right. The full register set is collected.

> subset that doesn't have some of the asked for --call-graph dwarf, what
> happens?

It reports a corrupted trace:
./perf record -g --call-graph dwarf,1024 --user-regs=IP -- ./stack_test2.2048.g.O3
in foo() ...
in bar() ...
done
[ perf record: Woken up 254 times to write data ]
[ perf record: Captured and wrote 63.341 MB perf.data (59286 samples) ]
[root@nntvtune39 stacks]# ./perf report -D > log.txt
unwind: can't read reg 7
0x835c0 [0x8]: failed to process type: 68
Error:
failed to process sample

./perf record -g --call-graph dwarf,1024 --user-regs=SP -- ./stack_test2.2048.g.O3
in foo() ...
in bar() ...
done
[ perf record: Woken up 256 times to write data ]
[ perf record: Captured and wrote 63.863 MB perf.data (59773 samples) ]
[root@nntvtune39 stacks]# ./perf report -D > log.txt
0x83180 [0x8]: failed to process type: 68
Error:
failed to process sample

./perf record -g --call-graph dwarf,1024 --user-regs=IP,SP -- ./stack_test2.2048.g.O3
in foo() ...
in bar() ...
done
[ perf record: Woken up 258 times to write data ]
[ perf record: Captured and wrote 64.301 MB perf.data (59757 samples) ]
[root@nntvtune39 stacks]# ./perf report -D > log.txt
unwind: can't read reg 6

./perf record -g --call-graph dwarf,1024 --user-regs=IP,SP,BP -- ./stack_test2.2048.g.O3
in foo() ...
in bar() ...
done
[ perf record: Woken up 259 times to write data ]
[ perf record: Captured and wrote 64.737 MB perf.data (59739 samples) ]
[root@nntvtune39 stacks]# ./perf report -D > log.txt
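
As a rough illustration of how the "can't read reg N" messages above relate
to the --user-regs subset, here is a minimal sketch. It assumes the x86_64
register indices follow the order of the register dump quoted earlier in this
thread (BP=6, SP=7, IP=8), which agrees with mask 0x1c0 for IP,SP,BP.
missing_unwind_regs() and the macro names are hypothetical, not perf code:

```c
#include <stdint.h>

/* Assumed x86_64 indices; consistent with mask 0x1c0 == BP|SP|IP above. */
enum { REG_BP = 6, REG_SP = 7, REG_IP = 8 };

#define REG_BIT(r)   (1ULL << (r))
#define UNWIND_REGS  (REG_BIT(REG_BP) | REG_BIT(REG_SP) | REG_BIT(REG_IP))

/*
 * Hypothetical helper: return the bits that the unwinder is assumed to
 * need (BP, SP, IP) but that are absent from the sampled user_mask.
 * A zero result means all three registers are present.
 */
static uint64_t missing_unwind_regs(uint64_t user_mask)
{
	return UNWIND_REGS & ~user_mask;
}
```

Under that assumption, --user-regs=IP leaves bits 6 and 7 (BP, SP) missing,
--user-regs=IP,SP leaves only bit 6 (BP), and --user-regs=IP,SP,BP leaves
nothing missing, which is in line with the "reg 7" and "reg 6" messages and
the clean last run.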

It looks like a minimal dwarf register set (IP, SP, BP) has to be collected
anyway, and that set has to be merged with the registers specified using the
--user-regs option:

options                                        collected registers
-g --call-graph dwarf,K                        all_regs
-g --call-graph dwarf,K --user-regs=user_regs  dwarf_regs | user_regs
--user-regs=user_regs                          user_regs
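
The three cases above could be expressed as a mask computation along these
lines. This is only a sketch of the proposed policy, not the actual perf
implementation: effective_user_regs() and both macros are hypothetical names,
and the mask values are simply the ones seen in the dumps earlier in this
thread (0xff0fff for the full set, 0x1c0 for BP|SP|IP):

```c
#include <stdbool.h>
#include <stdint.h>

#define ALL_REGS_MASK    0xff0fffULL /* full mask from the first dump */
#define DWARF_REGS_MASK  0x1c0ULL    /* BP|SP|IP, mask from the second dump */

/*
 * Hypothetical policy helper:
 *   dwarf only              -> all_regs
 *   dwarf + --user-regs     -> dwarf_regs | user_regs
 *   --user-regs only        -> user_regs
 *   neither                 -> no user registers sampled
 */
static uint64_t effective_user_regs(bool dwarf, bool user_given,
				    uint64_t user_mask)
{
	if (dwarf && !user_given)
		return ALL_REGS_MASK;
	if (dwarf)
		return DWARF_REGS_MASK | user_mask;
	return user_given ? user_mask : 0;
}
```

With --call-graph dwarf and --user-regs=AX,BX,CX (mask 0x7 under the same
index assumption), this would yield 0x1c7, so the callchain registers stay
available while the user still gets the requested ones.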

~Alexey

>
> - Arnaldo
>