Hi Suzuki
On Tue, 22 Mar 2022 at 12:35, Suzuki Kuruppassery Poulose
<suzuki.poulose@xxxxxxx> wrote:
On 22/03/2022 11:38, Mike Leach wrote:
HI Suzuki,
On Tue, 22 Mar 2022 at 10:43, Suzuki Kuruppassery Poulose
<suzuki.poulose@xxxxxxx> wrote:
+ Cc: James Clark
Hi Mike,
On 08/03/2022 20:49, Mike Leach wrote:
The current method for allocating trace source ID values to sources is
to use a fixed algorithm for CPU based sources of (cpu_num * 2 + 0x10).
The STM is allocated ID 0x1.
This fixed algorithm is used in both the CoreSight driver code, and by
perf when writing the trace metadata in the AUXTRACE_INFO record.
The method needs replacing as currently:-
1. It is inefficient in using available IDs.
2. Does not scale to larger systems with many cores and the algorithm
has no limits so will generate invalid trace IDs for cpu number > 44.
Thanks for addressing this issue.
Additionally requirements to allocate additional system IDs on some
systems have been seen.
This patch set introduces an API that allows the allocation of trace IDs
in a dynamic manner.
Architecturally reserved IDs are never allocated, and the system is
limited to allocating only valid IDs.
Each of the current trace sources ETM3.x, ETM4.x and STM is updated to use
the new API.
perf handling is changed so that the ID associated with the CPU is read
from sysfs. The ID allocator is notified when perf events start and stop
so CPU based IDs are kept constant throughout any perf session.
For the ETMx.x devices IDs are allocated on certain events
a) When using sysfs, an ID will be allocated on hardware enable, and freed
when the sysfs reset is written.
b) When using perf, ID is allocated on hardware enable, and freed on
hardware disable.
For both cases the ID is allocated when sysfs is read to get the current
trace ID. This ensures that consistent decode metadata can be extracted
from the system where this read occurs before device enable.
Note: This patchset breaks backward compatibility for perf record.
Because the method for generating the AUXTRACE_INFO meta data has
changed, using an older perf record will result in metadata that
does not match the trace IDs used in the recorded trace data.
This mismatch will cause subsequent decode to fail. Older versions of
perf will still be able to decode data generated by the updated system.
I have some concerns over this and the future plans for the dynamic
allocation per sink. i.e., we are breaking/modifying the perf now to
accommodate the dynamic nature of the trace id of a given CPU/ETM.
I don't beleive we have a choice for this - we cannot retain what is
an essentially broken allocation mechanism.
I completely agree and I am happy with the current step by step approach
of moving to a dynamic allocation scheme. Apologies, this wasn't
conveyed appropriately.
The proposed approach of exposing this via sysfs may (am not sure if
this would be the case) break for the trace-id per sink change, as a
sink could assign different trace-id for a CPU depending.
If a path exists between a CPU and a sink - the current framework as
far as I can tell would not allow for a new path to be set up between
the cpu and another sink.
e.g, if we have concurrent perf sessions, in the future with sink based
allocation :
perf record -e cs_etm/@sink1/... payload1
perf record -e cs_etm/@sink2/... payload2
perf record -e cs_etm// ... payload3
The trace id allocated for first session for CPU0 *could* be different
from that of the second or the third.
If these sessions run concurrently then the same Trace ID will be used
for CPU0 for all the sessions.
We ensure this by notifications that a cs_etm session is starting and
stopping - and keep a refcount.