Re: [PATCHv2 1/3] iommu/io-pgtable: Add a quirk to use tlb_flush_all() for partial walk flush

From: Robin Murphy
Date: Tue Jun 22 2021 - 14:37:47 EST


On 2021-06-22 15:27, Sai Prakash Ranjan wrote:
Hi Robin,

On 2021-06-22 17:41, Robin Murphy wrote:
On 2021-06-22 08:11, Sai Prakash Ranjan wrote:
Hi Robin,

On 2021-06-21 21:15, Robin Murphy wrote:
On 2021-06-18 03:51, Sai Prakash Ranjan wrote:
Add a quirk IO_PGTABLE_QUIRK_TLB_INV_ALL to invalidate entire context
with tlb_flush_all() callback in partial walk flush to improve unmap
performance on select few platforms where the cost of over-invalidation
is less than the unmap latency.

I still think this doesn't belong anywhere near io-pgtable at all.
It's a driver-internal decision how exactly it implements a non-leaf
invalidation, and that may be more complex than a predetermined
boolean decision. For example, I've just realised for SMMUv3 we can't
invalidate multiple levels of table at once with a range command,
since if we assume the whole thing is mapped at worst-case page
granularity we may fail to invalidate any parts which are mapped as
intermediate-level blocks. If invalidating a 1GB region (with 4KB
granule) means having to fall back to 256K non-range commands, we may
not want to invalidate by VA then, even though doing so for a 2MB
region is still optimal.

It's also quite feasible that drivers might want to do this for leaf
invalidations too - if you don't like issuing 512 commands to
invalidate 2MB, do you like issuing 511 commands to invalidate 2044KB?
- and at that point the logic really has to be in the driver anyway.


Ok I will move this to tlb_flush_walk() functions in the drivers. In the previous
v1 thread, you suggested to make the choice in iommu_get_dma_strict() test,
I assume you meant the test in iommu_dma_init_domain() with a flag or was it
the leaf driver(ex:arm-smmu.c) test of iommu_get_dma_strict() in init_domain?

Yes, I meant literally inside the same condition where we currently
set "pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;" in
arm_smmu_init_domain_context().


Ok got it, thanks.

I am still a bit confused on where this flag would be? Should this be a part
of struct iommu_domain?

Well, if you were to rewrite the config with an alternative set of
flush_ops at that point it would be implicit. For a flag, probably
either in arm_smmu_domain or arm_smmu_impl. Maybe a flag would be less
useful than generalising straight to a "maximum number of by-VA
invalidations it's worth sending individually" threshold value?

But then we would still need some flag to make this implementation
specific (qcom specific for now) and this threshold would just be
another condition although it would have been useful if this was
generic enough.

Well, for that approach I assume we could do something like special-case 0, or if it's a mutable per-domain value maybe just initialise it to SIZE_MAX or whatever such that it would never be reached in practice. Whichever way, it was meant to be implied that anything at the domain level would still be subject to final adjustment by the init_context hook.

Robin.

It's clear to me what overall shape and separation of responsibility is
most logical, but beyond that I don't have a particularly strong
opinion on the exact implementation; I've just been chucking ideas
around :)


Your ideas are very informative and useful :)

Thanks,
Sai