Re: [PATCH 3/4] coresight: trbe: Work around the invalid prohibited states

From: Suzuki K Poulose
Date: Wed Jan 05 2022 - 08:54:41 EST


On 05/01/2022 11:16, Anshuman Khandual wrote:


On 1/5/22 3:43 PM, Suzuki K Poulose wrote:
Hi Anshuman

On 05/01/2022 05:05, Anshuman Khandual wrote:
TRBE implementations affected by Arm erratum #2038923 might get TRBE into
an inconsistent view on whether trace is prohibited within the CPU. As a
result, the trace buffer or trace buffer state might be corrupted. This
happens after TRBE buffer has been enabled by setting TRBLIMITR_EL1.E,
followed by just a single context synchronization event before execution
changes from a context, in which trace is prohibited to one where it isn't,
or vice versa. In these mentioned conditions, the view of whether trace is
prohibited is inconsistent between parts of the CPU, and the trace buffer
or the trace buffer state might be corrupted.

Work around this problem in the TRBE driver by preventing an inconsistent
view of whether the trace is prohibited or not based on TRBLIMITR_EL1.E by
immediately following a change to TRBLIMITR_EL1.E with at least one ISB
instruction before an ERET, or two ISB instructions if no ERET is to take
place. This adds a new cpu errata in arm64 errata framework and also
updates TRBE driver as required.

Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
Cc: Will Deacon <will@xxxxxxxxxx>
Cc: Mathieu Poirier <mathieu.poirier@xxxxxxxxxx>
Cc: Suzuki Poulose <suzuki.poulose@xxxxxxx>
Cc: coresight@xxxxxxxxxxxxxxxx
Cc: linux-doc@xxxxxxxxxxxxxxx
Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx
Signed-off-by: Anshuman Khandual <anshuman.khandual@xxxxxxx>
---
  Documentation/arm64/silicon-errata.rst       |  2 +
  arch/arm64/Kconfig                           | 23 ++++++++++
  arch/arm64/kernel/cpu_errata.c               |  9 ++++
  arch/arm64/tools/cpucaps                     |  1 +
  drivers/hwtracing/coresight/coresight-trbe.c | 47 +++++++++++++++-----
  5 files changed, 72 insertions(+), 10 deletions(-)

As with the previous patch, it may be a good idea to split the
patch to arm64 and trbe parts.

Sure, will do.



diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
index c9b30e6c2b6c..e0ef3e9a4b8b 100644
--- a/Documentation/arm64/silicon-errata.rst
+++ b/Documentation/arm64/silicon-errata.rst
@@ -54,6 +54,8 @@ stable kernels.
  +----------------+-----------------+-----------------+-----------------------------+
  | ARM            | Cortex-A510     | #2064142        | ARM64_ERRATUM_2064142       |
  +----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Cortex-A510     | #2038923        | ARM64_ERRATUM_2038923       |
++----------------+-----------------+-----------------+-----------------------------+
  | ARM            | Cortex-A53      | #826319         | ARM64_ERRATUM_826319        |
  +----------------+-----------------+-----------------+-----------------------------+
  | ARM            | Cortex-A53      | #827319         | ARM64_ERRATUM_827319        |
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 2105b68d88db..026e34fb6fad 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -796,6 +796,29 @@ config ARM64_ERRATUM_2064142
          If unsure, say Y.
  +config ARM64_ERRATUM_2038923
+    bool "Cortex-A510: 2038923: workaround TRBE corruption with enable"
+    depends on CORESIGHT_TRBE
+    default y
+    help
+      This option adds the workaround for ARM Cortex-A510 erratum 2038923.
+
+      Affected Cortex-A510 core might cause an inconsistent view on whether trace is
+      prohibited within the CPU. As a result, the trace buffer or trace buffer state
+      might be corrupted. This happens after TRBE buffer has been enabled by setting
+      TRBLIMITR_EL1.E, followed by just a single context synchronization event before
+      execution changes from a context, in which trace is prohibited to one where it
+      isn't, or vice versa. In these mentioned conditions, the view of whether trace
+      is prohibited is inconsistent between parts of the CPU, and the trace buffer or
+      the trace buffer state might be corrupted.
+
+      Work around this in the driver by preventing an inconsistent view of whether the
+      trace is prohibited or not based on TRBLIMITR_EL1.E by immediately following a
+      change to TRBLIMITR_EL1.E with at least one ISB instruction before an ERET, or
+      two ISB instructions if no ERET is to take place.
+
+      If unsure, say Y.
+
  config CAVIUM_ERRATUM_22375
      bool "Cavium erratum 22375, 24313"
      default y
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index cbb7d5a9aee7..60b0c1f1d912 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -607,6 +607,15 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
          ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A510, 0, 0, 2)
      },
  #endif
+#ifdef CONFIG_ARM64_ERRATUM_2038923
+    {
+        .desc = "ARM erratum 2038923",
+        .capability = ARM64_WORKAROUND_2038923,
+
+        /* Cortex-A510 r0p0 - r0p2 */
+        ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A510, 0, 0, 2)
+    },
+#endif
      {
      }
  };
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index fca3cb329e1d..45a06d36d080 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -56,6 +56,7 @@ WORKAROUND_1463225
  WORKAROUND_1508412
  WORKAROUND_1542419
  WORKAROUND_2064142
+WORKAROUND_2038923
  WORKAROUND_TRBE_OVERWRITE_FILL_MODE
  WORKAROUND_TSB_FLUSH_FAILURE
  WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index ec24b62b2cec..0689c6dab96d 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -92,11 +92,13 @@ struct trbe_buf {
  #define TRBE_WORKAROUND_OVERWRITE_FILL_MODE    0
  #define TRBE_WORKAROUND_WRITE_OUT_OF_RANGE    1
  #define TRBE_WORKAROUND_SYSREG_WRITE_FAILURE    2
+#define TRBE_WORKAROUND_CORRUPTION_WITH_ENABLE    3
    static int trbe_errata_cpucaps[] = {
      [TRBE_WORKAROUND_OVERWRITE_FILL_MODE] = ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE,
      [TRBE_WORKAROUND_WRITE_OUT_OF_RANGE] = ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE,
      [TRBE_WORKAROUND_SYSREG_WRITE_FAILURE] = ARM64_WORKAROUND_2064142,
+    [TRBE_WORKAROUND_CORRUPTION_WITH_ENABLE] = ARM64_WORKAROUND_2038923,
      -1,        /* Sentinel, must be the last entry */
  };
  @@ -174,6 +176,11 @@ static inline bool trbe_may_fail_sysreg_write(struct trbe_cpudata *cpudata)
      return trbe_has_erratum(cpudata, TRBE_WORKAROUND_SYSREG_WRITE_FAILURE);
  }
  +static inline bool trbe_may_corrupt_with_enable(struct trbe_cpudata *cpudata)
+{

minor nit: trbe_needs_{ctxt_sync, isb}_after_enable() ?

trbe_needs_ctxt_sync_after_enable() sounds better. Also will have to change
the index above as well .. TRBE_NEEDS_CTXT_SYNC_AFTER_ENABLE.


+    return trbe_has_erratum(cpudata, TRBE_WORKAROUND_CORRUPTION_WITH_ENABLE);
+}
+
  static int trbe_alloc_node(struct perf_event *event)
  {
      if (event->cpu == -1)
@@ -187,6 +194,30 @@ static inline void trbe_drain_buffer(void)
      dsb(nsh);
  }
  +static inline void set_trbe_enabled(struct trbe_cpudata *cpudata)
+{
+    u64 trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);

minor nit: This implies we do the TRBE programming in the following
manner in the common case (i.e, TRBE enabled in the beginning of a
session).
  -> set TRBE LIMIT
  -> read TRBE LIMIT
  -> set TRBE ENABLED

Could we please optimize this ? I believe the buf->trbe_limit
must hold the LIMITR value at any point in time. And thus this

But is not bit risky though ! We have got the following places where
given trbe_limit instance changes its value.

drivers/../coresight-trbe.c: buf->trbe_limit = buf->trbe_base + nr_pages * PAGE_SIZE;
drivers/../coresight-trbe.c: buf->trbe_limit = compute_trbe_buffer_limit(handle);
drivers/../coresight-trbe.c: buf->trbe_limit -= PAGE_SIZE;

Those are the places where we compute the trbe_limit, *before*
we enable the TRBE. And we don't change recompute the limit
*without disabling* the TRBE. To make it more clear, the
only place where we set TRBE enabled without "computing"
the trbe_limit is when we hit a spurious interrupt.
But the value in the TRBLIMITR should already match the
buf->trbe_limit and we are only going to re-enable the
TRBE with the same limit. The other option is to
pass down the "limit" to the set_trbe_enabled().


function could simply be :

set_trbe_enabled(trbe_buf)
{
    limitr = trbe_buf->limit | LIMITR_ENABLE
    write(limitr, TRBLIMITR_EL1);
    ...
}

Is the potential for performance improvement here, out weigh possible
risks of using buf->trbe_limit directly while enabling the TRBE ?

I somehow don't like the fact that we have additional write and read
for the most common case of the TRBE usage (i.e, for arm_trbe_enable()).
If we could avoid that, that may be better.

Cheers
Suzuki