Re: [PATCH v1] iommu/arm-smmu-v3: Allow default substream bypass with a pasid support

From: Nicolin Chen
Date: Tue Jun 27 2023 - 20:13:40 EST


Thanks for the reply.

On Wed, Jun 28, 2023 at 12:29:52AM +0100, Robin Murphy wrote:

> > > > Also, add STRTAB_STE_1_SHCFG_NONSHAREABLE of the default configuration
> > > > to distinguish from STRTAB_STE_1_SHCFG_INCOMING of the bypass one.
> > >
> > > Why? The "default configuration" is that the S1 shareability attribute
> > > is determined by the S1 translation itself, so the incoming value is
> > > irrelevant.
> >
> > That was for consistency, since the driver set the SHCFG field
> > to 0x0 (STRTAB_STE_1_SHCFG_NONSHAREABLE). I was not quite sure
> > whether, in the long run, leaving an uncleared s1_cfg->shcfg could
> > result in unexpected behavior if it's passed in the STE.
> > Yet, we can seemingly be sure that !IOMMU_DOMAIN_IDENTITY
> > means the S1 translation must be enabled, and so the SHCFG would
> > be irrelevant?
> >
> > If so, I may make it:
> >
> > +	if (smmu_domain->domain.type == IOMMU_DOMAIN_IDENTITY) {
> > +		cfg->s1dss = STRTAB_STE_1_S1DSS_BYPASS;
> > +		cfg->shcfg = STRTAB_STE_1_SHCFG_INCOMING;
> > +	} else {
> > +		cfg->s1dss = STRTAB_STE_1_S1DSS_SSID0;
> > +	}
>
> What I mean is we don't need a cfg->shcfg field at all - without loss of
> generality it can simply be hard-coded to 1 when S1 is active, same as
> for stream bypass.

OK.
--------------------------------------------------------------------------------------------------
@@ -1350,7 +1350,8 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
 		dst[1] = cpu_to_le64(
-			 FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) |
+			 FIELD_PREP(STRTAB_STE_1_S1DSS, s1_cfg->s1dss) |
+			 FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING) |
 			 FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) |
--------------------------------------------------------------------------------------------------
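To make the effect of that hunk concrete, here is a minimal userspace sketch of how those three fields land in STE word 1. The GENMASK_ULL()/FIELD_PREP() macros are simplified stand-ins for the kernel ones, ste1_s1_fields() is a hypothetical helper, and the bit positions and encodings are assumptions that should be checked against arm-smmu-v3.h before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's GENMASK_ULL()/FIELD_PREP(). */
#define GENMASK_ULL(h, l) (((~0ULL) >> (63 - (h))) & ((~0ULL) << (l)))
#define FIELD_PREP(mask, val) \
	(((uint64_t)(val) << __builtin_ctzll(mask)) & (mask))

/* Assumed layout of STE word 1; verify against arm-smmu-v3.h. */
#define STRTAB_STE_1_S1DSS		GENMASK_ULL(1, 0)
#define STRTAB_STE_1_S1DSS_BYPASS	0x1ULL
#define STRTAB_STE_1_S1DSS_SSID0	0x2ULL
#define STRTAB_STE_1_S1CIR		GENMASK_ULL(3, 2)
#define STRTAB_STE_1_S1C_CACHE_WBRA	0x1ULL
#define STRTAB_STE_1_SHCFG		GENMASK_ULL(45, 44)
#define STRTAB_STE_1_SHCFG_INCOMING	0x1ULL

/*
 * Hypothetical helper: only S1DSS varies per domain, while SHCFG is
 * hard-coded to "use incoming", as for stream bypass.
 */
static uint64_t ste1_s1_fields(uint64_t s1dss)
{
	assert(s1dss <= 0x3);	/* S1DSS is a 2-bit field */

	return FIELD_PREP(STRTAB_STE_1_S1DSS, s1dss) |
	       FIELD_PREP(STRTAB_STE_1_SHCFG, STRTAB_STE_1_SHCFG_INCOMING) |
	       FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA);
}
```

With this layout, the identity and translation cases differ only in the low two bits of the word, which is why no cfg->shcfg field is needed. Note that __builtin_ctzll() is a GCC/Clang builtin.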

> The only case where explicitly setting STE.SHCFG=0 makes some sense is
> for a stage-2-only domain if a device's incoming attribute is stronger
> than it needs to be, but even then there are multiple levels of
> IMP-DEFness around whether SHCFG actually does anything anyway.

I see. Thanks for elaborating.

> > > > @@ -2198,7 +2206,11 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain,
> > > >  	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> > > >  	struct arm_smmu_device *smmu = smmu_domain->smmu;
> > > >
> > > > -	if (domain->type == IOMMU_DOMAIN_IDENTITY) {
> > > > +	/*
> > > > +	 * A master with a pasid capability might need a CD table, so only set
> > > > +	 * ARM_SMMU_DOMAIN_BYPASS if IOMMU_DOMAIN_IDENTITY and non-pasid master
> > > > +	 */
> > > > +	if (domain->type == IOMMU_DOMAIN_IDENTITY && !master->ssid_bits) {
> > > >  		smmu_domain->stage = ARM_SMMU_DOMAIN_BYPASS;
> > > >  		return 0;
> > > >  	}
> > >
> > > This means we'll now go on to allocate a pagetable for an identity
> > > domain, which doesn't seem ideal :/
> >
> > Do you suggest to bypass alloc_io_pgtable_ops()? That would zero
> > out the TCR fields in the CD. Not sure if it'd work seamlessly,
> > but I can give it a try.
>
> I think if there's a good reason to support this then it's worth

There is a non-negligible perf difference that we see on real HW.
So the reason, or (I should say) the requirement, is pretty strong.

> supporting properly, i.e. refactor a bit harder to separate the CD table
> parts which are common to both S1DSS bypass and S1 translation, from the
> CD/pagetable parts that are only relevant for translation. S1DSS bypass
> remains the same as Stream bypass in the sense that there is no
> structure corresponding to the identity domain itself, so not only does
> it not make sense to have a pagetable, there's also no valid place to
> put one anyway - touching the CD belonging to SSID 0 is strictly wrong.

I can try that. Yet, I think the S1DSS bypass case still belongs
to ARM_SMMU_DOMAIN_S1/arm_smmu_domain_finalise_s1, right?

I'd try keeping most of the parts intact while adding a pointer
to a structure holding the pagetable-related fields, to make it
cleaner. Then the S1DSS bypass case can be flagged by a NULL pointer.
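
Roughly, the shape I have in mind is something like the sketch below. All struct, field, and function names here are hypothetical placeholders rather than the driver's current definitions, and the S1DSS encodings are assumed; it is only meant to show the "no pagetable pointer means S1DSS bypass" idea.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: translation-only state, split out of the S1 config. */
struct s1_pgtbl_cfg {
	unsigned long ttbr;
	unsigned long tcr;
};

/* Hypothetical: the parts common to S1DSS bypass and S1 translation. */
struct s1_cfg_sketch {
	unsigned int s1cdmax;
	struct s1_pgtbl_cfg *pgtbl;	/* NULL => S1DSS bypass, no pagetable */
};

/* Assumed STE.S1DSS encodings; verify against arm-smmu-v3.h. */
enum { S1DSS_BYPASS = 0x1, S1DSS_SSID0 = 0x2 };

/* An identity domain owns no pagetable, so the pointer stays NULL. */
static int s1_cfg_is_s1dss_bypass(const struct s1_cfg_sketch *cfg)
{
	assert(cfg != NULL);
	return cfg->pgtbl == NULL;
}

/* Pick the S1DSS value to write into the STE from the config alone. */
static unsigned int s1_cfg_s1dss(const struct s1_cfg_sketch *cfg)
{
	return s1_cfg_is_s1dss_bypass(cfg) ? S1DSS_BYPASS : S1DSS_SSID0;
}
```

That way the CD table setup stays shared, and only the translation path would ever dereference the pagetable structure.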

Thanks
Nic