Re: [PATCH v5 6/6] swiotlb: Remove pointless stride adjustment for allocations >= PAGE_SIZE

From: Petr Tesařík
Date: Mon Mar 04 2024 - 10:55:19 EST


On Mon, 4 Mar 2024 13:37:56 +0000
Robin Murphy <robin.murphy@xxxxxxx> wrote:

> On 04/03/2024 11:00 am, Petr Tesařík wrote:
> [...]
> >> Here's my take on tying all the threads together. There are
> >> four alignment combinations:
> >>
> >> 1. alloc_align_mask: zero; min_align_mask: zero
> >> 2. alloc_align_mask: zero; min_align_mask: non-zero
> >> 3. alloc_align_mask: non-zero; min_align_mask: zero/ignored
> >> 4. alloc_align_mask: non-zero; min_align_mask: non-zero
> >
> > What does "min_align_mask: zero/ignored" mean? Under which
> > circumstances should be a non-zero min_align_mask ignored?
> >
> >> xen_swiotlb_map_page() and dma_direct_map_page() are #1 or #2
> >> via swiotlb_map() and swiotlb_tbl_map_single()
> >>
> >> iommu_dma_map_page() is #3 and #4 via swiotlb_tbl_map_single()
> >>
> >> swiotlb_alloc() is #3, directly to swiotlb_find_slots()
> >>
> >> For #1, the returned physical address has no constraints if
> >> the requested size is less than a page. For page size or
> >> greater, the discussed historical requirement for page
> >> alignment applies.
> >>
> >> For #2, min_align_mask governs the bits of the returned
> >> physical address that must match the original address. When
> >> needed, swiotlb must also allocate pre-padding aligned to
> >> IO_TLB_SIZE that precedes the returned physical address. A
> >> request size <= swiotlb_max_mapping_size() will not exceed
> >> IO_TLB_SEGSIZE even with the padding. The historical
> >> requirement for page alignment does not apply because the
> >> driver has explicitly used the newer min_align_mask feature.
> >
> > What is the idea here? Is it the assumption that only old drivers rely
> > on page alignment, so if they use min_align_mask, it proves that they
> > are new and must not rely on page alignment?
>
> Yes, if a driver goes out of its way to set a min_align_mask which is
> smaller than its actual alignment constraint, that is clearly the
> driver's own bug. Strictly we only need to be sympathetic to drivers
> which predate min_align_mask, when implicitly relying on page alignment
> was all they had.
>
> >> For #3, alloc_align_mask specifies the required alignment. No
> >> pre-padding is needed. Per earlier comments from Robin[1],
> >> it's reasonable to assume alloc_align_mask (i.e., the granule)
> >> is >= IO_TLB_SIZE. The original address is not relevant in
> >> determining the alignment, and the historical page alignment
> >> requirement does not apply since alloc_align_mask explicitly
> >> states the alignment.
>
> FWIW I'm also starting to wonder about getting rid of the alloc_size
> argument and just have SWIOTLB round the end address up to
> alloc_align_mask itself as part of all these calculations. Seems like it
> could potentially end up a little simpler, maybe?
>
> >> For #4, the returned physical address must match the bits
> >> in the original address specified by min_align_mask. swiotlb
> >> swiotlb must also allocate pre-padding aligned to
> >> alloc_align_mask that precedes the returned physical address.
> >> Also per Robin[1], assume alloc_align_mask is >=
> >> min_align_mask, which solves the conflicting alignment
> >> problem pointed out by Petr[2]. Perhaps we should add a
> >> "WARN_ON(alloc_align_mask < min_align_mask)" rather than
> >> failing depending on which bits of the original address are
> >> set. Again, the historical requirement for page alignment does
> >> not apply.
> >
> > AFAICS the only reason this works in practice is that there are only
> > two in-tree users of min_align_mask: NVMe and Hyper-V. Both use a mask
> > of 12 bits, and the IOVA granule size is never smaller than 4K.
>
> If we assume a nonzero alloc_align_mask exclusively signifies iommu-dma,
> then for this situation SWIOTLB should only need to worry about the
> intersection of alloc_align_mask & min_align_mask, since any
> min_align_mask bits larger than the IOVA granule would need to be
> accounted for in the IOVA allocation regardless of SWIOTLB.

Ah, right, it's not limited to bounce buffers.

> > If we want to rely on this, then I propose to make a BUG_ON() rather
> > than WARN_ON().
>
> I've just proposed a patch to make it not matter for now - the nature of
> iommu-dma makes it slightly more awkward to prevent SWIOTLB from ever
> seeing this condition at all, so I chose not to do that, but as long as
> swiotlb_tbl_map_single() does *something* for conflicting constraints
> without completely falling over, which swiotlb_tbl_unmap_single can then
> undo again, then it should be fine.

Yes. It may allocate an unsuitably aligned bounce buffer, or it may
fail, but your IOMMU patch will continue to work (and also cover the
non-SWIOTLB case).

I believe this patch series is now good as is, except the commit
message should make it clear that alloc_align_mask and min_align_mask
can both be zero, but that simply means no alignment constraints.

Thanks,
Petr T