Re: [RFC PATCH] iommu/arm-smmu: Introduction of ACTLR for custom prefetch setting

From: Bibek Kumar Patro
Date: Fri Sep 22 2023 - 06:23:02 EST




On 9/18/2023 4:49 PM, Robin Murphy wrote:
On 2023-09-15 20:52, Bibek Kumar Patro wrote:
Hi community,

I have a requirement which I'm looking for inputs from the community.
Currently in QCOM SOCs ACTLR register is used to store the custom
prefetcher setting by each client opting to use this feature.
This helps to store the next set of instructions to be prefetched
as per the value stored.This could help in a significant performance bump.
Requirement is to create an universal and flexible interface to store
and set this prefetch value which is unique for each SID.

Currently this ACTLR property is stored for each client as
DT property of smmu has following fields as mentioned:

    actlr-cells = <no of fields>
    actlrs = <SID MASK ACTLR_value>

These entries are parsed in arm-smmu layer and is stored in a table
containing actrl values corresponding to each SID and MASK. This ACTLR
value is then used during contextbank initialisation. Hence this entire
DT entry process is being carried out by arm-smmu layer.

I'm trying to envise a design where client can set this property in
their own DT entry as per their required value in case they have this
use-case. Here ACTLR value can be populated as 3rd optional field
in iommu-cells property which is already being used by clients to store
SID and MASK(optional) as mentioned:

    iommu-cells = <SID MASK(optional) ACTLR value(optional)>

This would help to avoid additional property in client DT as
exisiting DT property can be extrapolated to store the same. And this
prefetcher value can be set into the ARM_SMMU_CB_ACTLR register
during context bank inititalisation.
This patch is still WIP, so would like to get inputs and advise
from community on this design implementation or any alternate approach
to this requirement.

At the very least this would need to be in a implementation-specific backend, since everything about ACTLR is implementation-defined; there could be bits in there that the driver needs to manage itself and clients have absolutely no business overriding (e.g. the MMU-500 errata workarounds). The generic driver can't know what's valid, nor what the consequences are of not being able to satisfy a particular setting. Then there's still the question of what if two clients ask for different settings but want to attach to the same context?


Thanks Robin for you inputs. We had some rounds of discussions
internally after going through your concerns mentioned above.

It's also questionable whether this would belong in DT at all, since it has a bit of a smell of software policv about it.

ACTLR data which we feed into this register is client specific but at
the same time client won't have the provision to set a value as per
their choice.Similar to what SID and MASK value currently is which
is specific for each client but at the same time client don't have
the liberty to choose these values as per their choice.Hence this
was the initial motivation to move it to client specific DT property.
Also this would have helped us to match and set each ACTLR value with
corresponding SID and MASK for each client during the device attachment
to SMMU.

For many years we have been supporting ACTLR settings on all of our
SoCs with a device tree table. Iur downstream implementation is at [1].
This has worked us very well and did not see any need for client
specific customization as such. Since passing register content via
device tree is highly discouraged and also since ACTLR is
implementation-defined as rightly mentioned for which client doesn't
have the liberty to choose the value to be set here, we are moving the
configuration data to driver and apply these settings as per SoC needs.

We will host these data now in implementation specific backend driver
as suggested in Lookup table kind of format. Table will contain the SID
MASK and ACTLR value which can be used to populate the prefetch value
for each context bank during SMMU configuration probe.
This implementation should also prevent clients having liberty to go
and modify the actlr entry.

I'll send in the next RFC patch for review of the driver based design.

[1]
https://git.codelinaro.org/clo/la/kernel/msm-5.10/-/blob/kernel.lnx.5.10.r3-rel/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c#L2198


If it could be sufficiently justified then it would need a proper binding proposal, and "write this opaque value into this register" type properties are generally not approved of.

Thanks,
Robin.


Thanks,
Bibek

Signed-off-by: Bibek Kumar Patro <quic_bibekkum@xxxxxxxxxxx>
---
  drivers/iommu/arm/arm-smmu/arm-smmu.c | 14 ++++++++++++++
  drivers/iommu/arm/arm-smmu/arm-smmu.h |  6 ++++++
  2 files changed, 20 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index a86acd76c1df..2dd3796e30ea 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -917,11 +917,20 @@ static void arm_smmu_write_s2cr(struct arm_smmu_device *smmu, int idx)
      arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_S2CR(idx), reg);
  }

+static void arm_smmu_write_actlr(struct arm_smmu_device *smmu, int idx)
+{
+    struct arm_smmu_actlr *actlr = smmu->actlrs + idx;
+
+    u32 reg = FIELD_PREP(ARM_SMMU_CB_ACTLR, actlrs->actlr);
+}
+
  static void arm_smmu_write_sme(struct arm_smmu_device *smmu, int idx)
  {
      arm_smmu_write_s2cr(smmu, idx);
      if (smmu->smrs)
          arm_smmu_write_smr(smmu, idx);
+    if (smmu->actlrs)
+        arm_smmu_write_actlr(smmu, idx);
  }

  /*
@@ -1031,6 +1040,7 @@ static int arm_smmu_master_alloc_smes(struct device *dev)
      for_each_cfg_sme(cfg, fwspec, i, idx) {
          u16 sid = FIELD_GET(ARM_SMMU_SMR_ID, fwspec->ids[i]);
          u16 mask = FIELD_GET(ARM_SMMU_SMR_MASK, fwspec->ids[i]);
+        u32 actlr = FIELD_GET(ARM_SMMU_CB_ACTLR, fwspec->ids[i]);

          if (idx != INVALID_SMENDX) {
              ret = -EEXIST;
@@ -1524,6 +1534,10 @@ static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)

      if (args->args_count > 1)
          fwid |= FIELD_PREP(ARM_SMMU_SMR_MASK, args->args[1]);
+
+    if (args->args_count > 2)
+        fwid |= FIELD_PREP(ARM_SMMU_CB_ACTLR, args->args[2]);
+
      else if (!of_property_read_u32(args->np, "stream-match-mask", &mask))
          fwid |= FIELD_PREP(ARM_SMMU_SMR_MASK, mask);

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index 703fd5817ec1..7d402ab58dc8 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -274,6 +274,11 @@ struct arm_smmu_smr {
      bool                pinned;
  };

+struct arm_smmu_actlr {
+    struct arm_smmu_smr smr;
+    u32 actlr;
+};
+
  struct arm_smmu_device {
      struct device            *dev;

@@ -312,6 +317,7 @@ struct arm_smmu_device {
      u16                smr_mask_mask;
      struct arm_smmu_smr        *smrs;
      struct arm_smmu_s2cr        *s2crs;
+    struct arm_smmu_actlr        *actlrs;
      struct mutex            stream_map_mutex;

      unsigned long            va_size;
--
2.17.1