Re: [PATCH v2] iommu/arm-smmu-qcom: Rework the logic finding the bypass quirk

From: Robin Murphy
Date: Tue Mar 14 2023 - 11:25:36 EST


On 2023-03-14 14:32, Bjorn Andersson wrote:
On Tue, Mar 14, 2023 at 01:41:56PM +0000, Robin Murphy wrote:
On 2023-03-14 13:20, Manivannan Sadhasivam wrote:
On Tue, Mar 14, 2023 at 11:58:24AM +0000, Robin Murphy wrote:
On 2023-03-14 11:26, Manivannan Sadhasivam wrote:
On Tue, Mar 14, 2023 at 12:17:38PM +0100, Johan Hovold wrote:
On Tue, Mar 14, 2023 at 04:29:05PM +0530, Manivannan Sadhasivam wrote:
The logic used to find the quirky firmware that intercepts the writes to
S2CR register to replace bypass type streams with a fault, and ignore the
fault type, is not working with the firmware on newer SoCs like SC8280XP.

The current logic uses the last stream mapping group (num_mapping_groups
- 1) as an index for finding quirky firmware. But on SC8280XP, NUMSMRG
reports a value of 162 (possibly emulated by the hypervisor) and logic is
not working for stream mapping groups > 128. (Note that the ARM SMMU
architecture specification defines NUMSMRG in the range of 0-127).

So the current logic that checks the (162-1)th S2CR entry fails to detect
the quirky firmware on these devices and SMMU triggers invalid context
fault for bypass streams.

To fix this issue, rework the logic to find the first non-valid (free)
stream mapping register group (SMR) within 128 groups and use that index
to access S2CR for detecting the bypass quirk. If no free groups are
available, then just skip the quirk detection.

While at it, let's move the quirk detection logic to a separate function
and change the local variable name from last_s2cr to free_s2cr.

Reviewed-by: Bjorn Andersson <andersson@xxxxxxxxxx>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
---

Changes in v2:

* Limited the check to 128 groups as per ARM SMMU spec's NUMSMRG range
* Moved the quirk handling to its own function
* Collected review tag from Bjorn

drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 48 ++++++++++++++++++----
1 file changed, 40 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index d1b296b95c86..48362d7ef451 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -266,25 +266,49 @@ static int qcom_smmu_init_context(struct arm_smmu_domain *smmu_domain,
 	return 0;
 }
-static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu)
+static void qcom_smmu_bypass_quirk(struct arm_smmu_device *smmu)
 {
-	unsigned int last_s2cr = ARM_SMMU_GR0_S2CR(smmu->num_mapping_groups - 1);
 	struct qcom_smmu *qsmmu = to_qcom_smmu(smmu);
-	u32 reg;
-	u32 smr;
+	u32 free_s2cr;
+	u32 reg, smr;
 	int i;
+	/*
+	 * Find the first non-valid (free) stream mapping register group and
+	 * use that index to access S2CR for detecting the bypass quirk.
+	 *
+	 * Note that only the first 128 stream mapping groups are considered for
+	 * the check. This is because the ARM SMMU architecture specification
+	 * defines NUMSMRG (Number of Stream Mapping Register Groups) in the
+	 * range of 0-127, but some Qcom platforms emulate more stream mapping
+	 * groups with the help of hypervisor. And those groups don't exhibit
+	 * the quirky behavior.
+	 */
+	for (i = 0; i < 128; i++) {

This may now access registers beyond smmu->num_mapping_groups. Should
you not use the minimum of these two values here (and below)?
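
For illustration only, a bounded search along the lines suggested above could look like the following user-space sketch. It is not the v3 patch: the smr[] array is a hypothetical in-memory snapshot standing in for register reads, and find_free_smr(), SMMU_ARCH_MAX_SMRG and the bit-31 valid flag are names and layout assumed for this example.

```c
#include <stdint.h>

/* Upper bound on stream mapping groups used in the patch under review. */
#define SMMU_ARCH_MAX_SMRG 128u
/* Assumed for this sketch: bit 31 of an SMR marks the entry as valid. */
#define SMR_VALID (UINT32_C(1) << 31)

/*
 * Return the index of the first free (non-valid) SMR, or -1 if there is
 * none. The search never goes past num_mapping_groups and never past
 * the 128-group bound, whichever is smaller.
 *
 * smr[] is a hypothetical snapshot of the SMR contents, not a real
 * register accessor.
 */
static int find_free_smr(const uint32_t *smr, unsigned int num_mapping_groups)
{
	unsigned int limit = num_mapping_groups < SMMU_ARCH_MAX_SMRG ?
			     num_mapping_groups : SMMU_ARCH_MAX_SMRG;
	unsigned int i;

	for (i = 0; i < limit; i++) {
		if (!(smr[i] & SMR_VALID))
			return (int)i;
	}

	/* No free group within the bound: caller skips quirk detection. */
	return -1;
}
```

With a reported count of 162, the loop stops at index 127; with a reported count of 2, it stops at index 1.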


Doh! yeah, you're right. Will fix it in v3.

FWIW I'd say it's probably best if the cfg_probe hook clamps
smmu->num_mapping_groups to the architectural maximum straight away, to also
prevent the main driver iterating off into the nonsensical area in
arm_smmu_device_reset() or the SMR allocator itself.
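
A minimal sketch of that clamping idea, assuming a stand-in structure rather than the real struct arm_smmu_device. The names fake_smmu, clamp_mapping_groups() and SMMU_ARCH_MAX_SMRG are hypothetical and exist only for this example:

```c
/* Architectural maximum discussed in this thread. */
#define SMMU_ARCH_MAX_SMRG 128u

/* Hypothetical stand-in for the one driver field that matters here. */
struct fake_smmu {
	unsigned int num_mapping_groups;
};

/*
 * Clamp the reported group count once, early in probe, so that every
 * later loop over the mapping groups stays within the architectural range.
 */
static void clamp_mapping_groups(struct fake_smmu *smmu)
{
	if (smmu->num_mapping_groups > SMMU_ARCH_MAX_SMRG)
		smmu->num_mapping_groups = SMMU_ARCH_MAX_SMRG;
}
```

Clamping once at probe time means the quirk detection, the reset path and the SMR allocator all see the same bounded count, instead of each needing its own limit.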


We considered that also but Qcom purposefully extended the NUMSMRG for
a virtualization use case and we do not have a clear picture of it.

Whatever that supposed use-case may be, Linux does not support it, and
clearly isn't going to support it any time soon if we don't even know what

Can you please elaborate on what it is that would prevent Linux from
handling hardware with more than 128 SMRs?

https://developer.arm.com/documentation/ihi0062/latest

I would expect actual hardware to follow the architecture (because it would need to pass validation suites etc.). The architecture defines an extension for supporting up to 1024 stream mapping groups, but that works very differently.

The SC8280XP DT claims this SMMU is compatible with a standard Arm MMU-500, which definitely does not support more than 128 SMRs, so I have no idea what the hypervisor might be up to.

it is. Therefore Linux does not need to accommodate this weirdness for the
foreseeable future, beyond simply making sure it doesn't cause any problems
for what Linux *does* support. It's bad enough that the emulation of
"normal" SMRs continues to violate the architecture, but I'm even more
uncomfortable letting the generic architecture driver poke at completely
non-architectural registers which don't even have the same behaviour as the
ones they're supposedly extending.

Afaict there's nothing special about the SMRs beyond 128 on this
platform...

If that were true then why would this patch be a thing at all? :/

Thanks,
Robin.