Re: [PATCH v5 6/6] swiotlb: Remove pointless stride adjustment for allocations >= PAGE_SIZE

From: Robin Murphy
Date: Mon Mar 04 2024 - 08:38:46 EST


On 04/03/2024 11:00 am, Petr Tesařík wrote:
[...]
Here's my take on tying all the threads together. There are
four alignment combinations:

1. alloc_align_mask: zero; min_align_mask: zero
2. alloc_align_mask: zero; min_align_mask: non-zero
3. alloc_align_mask: non-zero; min_align_mask: zero/ignored
4. alloc_align_mask: non-zero; min_align_mask: non-zero

What does "min_align_mask: zero/ignored" mean? Under which
circumstances should be a non-zero min_align_mask ignored?

xen_swiotlb_map_page() and dma_direct_map_page() are #1 or #2
via swiotlb_map() and swiotlb_tbl_map_single()

iommu_dma_map_page() is #3 and #4 via swiotlb_tbl_map_single()

swiotlb_alloc() is #3, directly to swiotlb_find_slots()

For #1, the returned physical address has no constraints if
the requested size is less than a page. For page size or
greater, the discussed historical requirement for page
alignment applies.

For #2, min_align_mask governs the bits of the returned
physical address that must match the original address. When
needed, swiotlb must also allocate pre-padding aligned to
IO_TLB_SIZE that precedes the returned physical address. A
request size <= swiotlb_max_mapping_size() will not exceed
IO_TLB_SEGSIZE even with the padding. The historical
requirement for page alignment does not apply because the
driver has explicitly used the newer min_align_mask feature.

What is the idea here? Is it the assumption that only old drivers rely
on page alignment, so if they use min_align_mask, it proves that they
are new and must not rely on page alignment?

Yes, if a driver goes out of its way to set a min_align_mask which is smaller than its actual alignment constraint, that is clearly the driver's own bug. Strictly we only need to be sympathetic to drivers which predate min_align_mask, when implicitly relying on page alignment was all they had.

For #3, alloc_align_mask specifies the required alignment. No
pre-padding is needed. Per earlier comments from Robin[1],
it's reasonable to assume alloc_align_mask (i.e., the granule)
is >= IO_TLB_SIZE. The original address is not relevant in
determining the alignment, and the historical page alignment
requirement does not apply since alloc_align_mask explicitly
states the alignment.

FWIW I'm also starting to wonder about getting rid of the alloc_size argument and just have SWIOTLB round the end address up to alloc_align_mask itself as part of all these calculations. Seems like it could potentially end up a little simpler, maybe?

For #4, the returned physical address must match the bits
in the original address specified by min_align_mask. swiotlb
swiotlb must also allocate pre-padding aligned to
alloc_align_mask that precedes the returned physical address.
Also per Robin[1], assume alloc_align_mask is >=
min_align_mask, which solves the conflicting alignment
problem pointed out by Petr[2]. Perhaps we should add a
"WARN_ON(alloc_align_mask < min_align_mask)" rather than
failing depending on which bits of the original address are
set. Again, the historical requirement for page alignment does
not apply.

AFAICS the only reason this works in practice is that there are only
two in-tree users of min_align_mask: NVMe and Hyper-V. Both use a mask
of 12 bits, and the IOVA granule size is never smaller than 4K.

If we assume a nonzero alloc_align_mask exclusively signifies iommu-dma, then for this situation SWIOTLB should only need to worry about the intersection of alloc_align_mask & min_align_mask, since any min_align_mask bits larger than the IOVA granule would need to be accounted for in the IOVA allocation regardless of SWIOTLB.
If we want to rely on this, then I propose to make a BUG_ON() rather
than WARN_ON().

I've just proposed a patch to make it not matter for now - the nature of iommu-dma makes it slightly more awkward to prevent SWIOTLB from ever seeing this condition at all, so I chose not to do that, but as long as swiotlb_tbl_map_single() does *something* for conflicting constraints without completely falling over, which swiotlb_tbl_unmap_single can then undo again, then it should be fine.

Thanks,
Robin.