Re: [PATCH] iommu: Improve the performance for direct_mapping

From: Robin Murphy
Date: Thu Nov 26 2020 - 10:19:35 EST


On 2020-11-20 09:06, Yong Wu wrote:
Currently direct_mapping always uses the smallest pgsize, which is
normally SZ_4K, for mapping. This is unnecessary. We could gather the
size and then call iommu_map, which can decide how best to map the
region with the right pgsize.

As the original comment notes, we must take care of overlaps; otherwise
iommu_map may return -EEXIST. In the overlap case, we should first map
the region before the overlap, then map the remaining part.

Each iommu device calls this direct_mapping when its iommu initializes.
This patch improves the boot/initialization time, especially when only
level 1 mapping is needed.

Signed-off-by: Anan Sun <anan.sun@xxxxxxxxxxxx>
Signed-off-by: Yong Wu <yong.wu@xxxxxxxxxxxx>
---
drivers/iommu/iommu.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index df87c8e825f7..854a8fcb928d 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -737,6 +737,7 @@ static int iommu_create_device_direct_mappings(struct iommu_group *group,
/* We need to consider overlapping regions for different devices */
list_for_each_entry(entry, &mappings, list) {
dma_addr_t start, end, addr;
+ size_t unmapped_sz = 0;
if (domain->ops->apply_resv_region)
domain->ops->apply_resv_region(dev, domain, entry);
@@ -752,10 +753,25 @@ static int iommu_create_device_direct_mappings(struct iommu_group *group,
phys_addr_t phys_addr;
phys_addr = iommu_iova_to_phys(domain, addr);
- if (phys_addr)
+ if (phys_addr == 0) {
+ unmapped_sz += pg_size; /* Gather the size. */
continue;
+ }

I guess the reason we need to validate every page is because they may already have been legitimately mapped if someone else's reserved region overlaps - is it worth explicitly validating that, i.e. bail out if something's gone wrong enough that phys_addr != addr?

Other than the naming issue (I agree that map_size is a far, far better choice), I don't have any strong opinions about the rest of the implementation - I've written enough variations of this pattern to know that there's just no "nice" way to do it in C; all you can do is shuffle the clunkiness around :)

Robin.

- ret = iommu_map(domain, addr, addr, pg_size, entry->prot);
+ if (unmapped_sz) {
+ /* Map the region before the overlap. */
+ ret = iommu_map(domain, start, start,
+ unmapped_sz, entry->prot);
+ if (ret)
+ goto out;
+ start += unmapped_sz;
+ unmapped_sz = 0;
+ }
+ start += pg_size;
+ }
+ if (unmapped_sz) {
+ ret = iommu_map(domain, start, start, unmapped_sz,
+ entry->prot);
if (ret)
goto out;
}
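
For readers following the thread, the gather-then-map loop and the phys_addr != addr check raised in the reply can be modelled in user space. This is only a sketch under stated assumptions: toy_pt, toy_iova_to_phys, toy_map, toy_direct_map, PG_SIZE and NR_PAGES are hypothetical stand-ins invented for illustration, not kernel APIs, and map_size simply follows the naming preferred in the reply. It is not the code that was eventually merged.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define PG_SIZE  4096u
#define NR_PAGES 16u

/*
 * Toy "page table" (hypothetical): toy_pt[i] is the physical address
 * mapped at IOVA i * PG_SIZE, or 0 if nothing is mapped there. As with
 * the 0 return of iommu_iova_to_phys(), page 0 cannot be told apart
 * from "unmapped", so the model only uses pages 1 and up.
 */
static uint64_t toy_pt[NR_PAGES];
static unsigned int toy_map_calls;

static uint64_t toy_iova_to_phys(uint64_t iova)
{
	return toy_pt[iova / PG_SIZE];
}

/* Map [iova, iova + size); refuse if any page is already mapped. */
static int toy_map(uint64_t iova, uint64_t paddr, size_t size)
{
	size_t off;

	for (off = 0; off < size; off += PG_SIZE)
		if (toy_pt[(iova + off) / PG_SIZE])
			return -EEXIST;
	for (off = 0; off < size; off += PG_SIZE)
		toy_pt[(iova + off) / PG_SIZE] = paddr + off;
	toy_map_calls++;
	return 0;
}

/* Identity-map the page-aligned range [start, end), skipping overlaps. */
static int toy_direct_map(uint64_t start, uint64_t end)
{
	uint64_t addr, phys;
	size_t map_size = 0;
	int ret;

	if (end > (uint64_t)NR_PAGES * PG_SIZE)
		return -EINVAL;

	for (addr = start; addr < end; addr += PG_SIZE) {
		phys = toy_iova_to_phys(addr);
		if (phys == 0) {
			map_size += PG_SIZE;	/* gather the unmapped run */
			continue;
		}
		/* Already mapped: only acceptable if it is an identity mapping. */
		if (phys != addr)
			return -EINVAL;
		if (map_size) {
			/* Flush the run that ends just before this page. */
			ret = toy_map(addr - map_size, addr - map_size, map_size);
			if (ret)
				return ret;
			map_size = 0;
		}
	}
	if (map_size)	/* flush the trailing run, if any */
		return toy_map(end - map_size, end - map_size, map_size);
	return 0;
}
```

In this model, if page 3 is already identity-mapped and pages 1 to 5 are requested, the loop issues two toy_map calls (pages 1-2 and pages 4-5) instead of four single-page calls. Computing the run start as addr - map_size avoids carrying a separate moving start variable, which is one way of shuffling the clunkiness around rather than removing it.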