Re: [PATCH v5 3/3] Documentation: arm64: Document the PMU event counting threshold feature

From: James Clark
Date: Thu Nov 23 2023 - 10:45:19 EST




On 23/11/2023 05:50, Anshuman Khandual wrote:
>
>
> On 11/21/23 03:01, Namhyung Kim wrote:
>> On Mon, Nov 13, 2023 at 3:26 AM James Clark <james.clark@xxxxxxx> wrote:
>>> Add documentation for the new Perf event open parameters and
>>> the threshold_max capability file.
>>>
>>> Signed-off-by: James Clark <james.clark@xxxxxxx>
>>> ---
>>> Documentation/arch/arm64/perf.rst | 56 +++++++++++++++++++++++++++++++
>>> 1 file changed, 56 insertions(+)
>>>
>>> diff --git a/Documentation/arch/arm64/perf.rst b/Documentation/arch/arm64/perf.rst
>>> index 1f87b57c2332..36b8111a710d 100644
>>> --- a/Documentation/arch/arm64/perf.rst
>>> +++ b/Documentation/arch/arm64/perf.rst
>>> @@ -164,3 +164,59 @@ and should be used to mask the upper bits as needed.
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/tests/user-events.c
>>> .. _tools/lib/perf/tests/test-evsel.c:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/lib/perf/tests/test-evsel.c
>>> +
>>> +Event Counting Threshold
>>> +==========================================
>>> +
>>> +Overview
>>> +--------
>>> +
>>> +FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
>>> +events whose count meets a specified threshold condition. For example, if
>>> +threshold_compare is set to 2 ('Greater than or equal'), and the
>>> +threshold is set to 2, then the PMU counter will now only increment
>>> +when an event would have previously incremented the PMU counter by 2 or
>>> +more on a single processor cycle.
>>> +
>>> +To increment by 1 after passing the threshold condition instead of the
>>> +number of events on that cycle, add the 'threshold_count' option to the
>>> +commandline.
>>> +
>>> +How-to
>>> +------
>>> +
>>> +The threshold, threshold_compare and threshold_count values can be
>>> +provided per event:
>>> +
>>> +.. code-block:: sh
>>> +
>>> + perf stat -e stall_slot/threshold=2,threshold_compare=2/ \
>>> + -e dtlb_walk/threshold=10,threshold_compare=3,threshold_count/
>> Can you please explain this a bit more?
>>
>> I guess the first event counts stall_slot if the event count is
>> greater than or equal to 2. And as threshold_count is not set,
>> it'd count the stall_slot as is. E.g. it counts 3 when it sees 3.
>
> Hence without 'threshold_count' being set, the other two config requests
> will not have an effect, is that correct ?

Yeah, I can mention this. It's implied because 0 is the default value of
config fields, and 0 is a valid value for both the compare and count
fields, so threshold=0 has to be the way to disable it. But I can state
that explicitly.

>
>>
>> OTOH, dtlb_walk will count 1 if it sees an event less than 10.
>> Is my understanding correct?
>
> 'Equals' and 'Greater-than-or-equal' makes sense and are intuitive. Just
> wondering what will happen for 'Not-equal' and 'Less-than' - when would
> the counter count in such cases ?
>
> 0: Not-equal
> 1: Equals
> 2: Greater-than-or-equal
> 3: Less-than
>

They would count on any cycle where the event count is not equal to the
threshold value (for 'Not-equal') or is less than it (for 'Less-than').
Going into more detail would probably start to reproduce what's in the
reference manual, which has all the pseudocode describing how it works.
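
For the purposes of this thread only, here is a rough model of the four
compare modes as I've described them. It is a sketch, not the
architectural pseudocode: the function and constant names are made up
for illustration, and treating threshold=0 as "disabled" is the
assumption discussed above rather than something taken from the manual.

```python
# Illustrative model only; names are hypothetical, not kernel or perf APIs.
NOT_EQUAL, EQUALS, GREATER_OR_EQUAL, LESS_THAN = 0, 1, 2, 3


def counter_increment(events, threshold, threshold_compare, threshold_count=False):
    """Return how much the modelled counter advances on one cycle."""
    if threshold == 0:
        # Assumption from this thread: threshold=0 leaves thresholding disabled.
        return events
    met = {
        NOT_EQUAL: events != threshold,
        EQUALS: events == threshold,
        GREATER_OR_EQUAL: events >= threshold,
        LESS_THAN: events < threshold,
    }[threshold_compare]
    if not met:
        return 0
    # With threshold_count the counter advances by 1, otherwise by the
    # number of events seen on that cycle.
    return 1 if threshold_count else events


def run(per_cycle_events, **cfg):
    """Sum the modelled increments over a sequence of cycles."""
    return sum(counter_increment(e, **cfg) for e in per_cycle_events)


cycles = [0, 1, 2, 3, 12]  # made-up per-cycle event counts
ge_total = run(cycles, threshold=2, threshold_compare=GREATER_OR_EQUAL)
lt_total = run(cycles, threshold=10, threshold_compare=LESS_THAN,
               threshold_count=True)
```

In this model a cycle with zero events also satisfies 'Less-than', so it
contributes 1 when threshold_count is set. Whether that matches the
hardware exactly should be checked against the reference manual.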

As for use cases, I'm not really sure. It was probably little effort to
add to the hardware with a single NOT gate, and something could have
been missed if it hadn't been added. You might be able to do things like
count the inverse of a condition directly, without having to open a
second event and subtract from it to find the inverse.
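
A small sketch of what I mean by the inverse, again as a model with
made-up names and data rather than anything from the driver: with
threshold_count set, 'Greater-than-or-equal' and 'Less-than' split the
cycles between them, so the 'Less-than' count gives directly what you
would otherwise compute as total cycles minus the other count.

```python
def cycles_matching(per_cycle_events, threshold, less_than):
    """Count cycles meeting the condition, as threshold_count would (model only)."""
    return sum(1 for e in per_cycle_events if (e < threshold) == less_than)


cycles = [0, 4, 1, 7, 2, 2]  # made-up per-cycle event counts
at_least_2 = cycles_matching(cycles, 2, less_than=False)  # like threshold_compare=2
below_2 = cycles_matching(cycles, 2, less_than=True)      # like threshold_compare=3

# The two conditions partition the cycles, so either count can be derived
# from the other plus a cycle count, or measured directly.
total_cycles = len(cycles)
```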