Re: [PATCH v2 1/3] swiotlb: Fix allocation alignment requirement when searching slots

From: Robin Murphy
Date: Wed Jan 31 2024 - 10:54:21 EST


On 31/01/2024 12:25 pm, Will Deacon wrote:
Commit bbb73a103fbb ("swiotlb: fix a braino in the alignment check fix"),
which was a fix for commit 0eee5ae10256 ("swiotlb: fix slot alignment
checks"), causes a functional regression with vsock in a virtual machine
using bouncing via a restricted DMA SWIOTLB pool.

When virtio allocates the virtqueues for the vsock device using
dma_alloc_coherent(), the SWIOTLB search fails to take into account the
8KiB buffer size and returns page-unaligned allocations if 'area->index'
was left unaligned by a previous allocation from the buffer:

Hmm, but isn't this fundamentally swiotlb_alloc()'s fault for assuming it's going to get a page-aligned address back despite asking for 0 alignment in the first place? I'm not sure SWIOTLB has ever promised implicit size-alignment, so it feels somewhat misplaced to be messing with the algorithm before fixing the obvious issue in the caller :/

Cheers,
Robin.

# Final address in brackets is the SWIOTLB address returned to the caller
| virtio-pci 0000:00:07.0: orig_addr 0x0 alloc_size 0x2000, iotlb_align_mask 0x800 stride 0x2: got slot 1645-1649/7168 (0x98326800)
| virtio-pci 0000:00:07.0: orig_addr 0x0 alloc_size 0x2000, iotlb_align_mask 0x800 stride 0x2: got slot 1649-1653/7168 (0x98328800)
| virtio-pci 0000:00:07.0: orig_addr 0x0 alloc_size 0x2000, iotlb_align_mask 0x800 stride 0x2: got slot 1653-1657/7168 (0x9832a800)

This ends in tears (typically buffer corruption and/or a hang) because
swiotlb_alloc() blindly returns a pointer to the 'struct page'
corresponding to the allocation and therefore the first half of the page
ends up being allocated twice.

Fix the problem by treating the allocation alignment separately to any
additional alignment requirements from the device, using the maximum
of the two as the stride to search the buffer slots.

Fixes: bbb73a103fbb ("swiotlb: fix a braino in the alignment check fix")
Fixes: 0eee5ae10256 ("swiotlb: fix slot alignment checks")
Cc: Christoph Hellwig <hch@xxxxxx>
Cc: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>
Cc: Robin Murphy <robin.murphy@xxxxxxx>
Cc: Petr Tesarik <petr.tesarik1@xxxxxxxxxxxxxxxxxxx>
Cc: Dexuan Cui <decui@xxxxxxxxxxxxx>
Signed-off-by: Will Deacon <will@xxxxxxxxxx>
---
kernel/dma/swiotlb.c | 29 +++++++++++++++--------------
1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index b079a9a8e087..56cc08b1fbd6 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -982,7 +982,7 @@ static int swiotlb_search_pool_area(struct device *dev, struct io_tlb_pool *pool
phys_to_dma_unencrypted(dev, pool->start) & boundary_mask;
unsigned long max_slots = get_max_slots(boundary_mask);
unsigned int iotlb_align_mask =
- dma_get_min_align_mask(dev) | alloc_align_mask;
+ dma_get_min_align_mask(dev) & ~(IO_TLB_SIZE - 1);
unsigned int nslots = nr_slots(alloc_size), stride;
unsigned int offset = swiotlb_align_offset(dev, orig_addr);
unsigned int index, slots_checked, count = 0, i;
@@ -993,19 +993,18 @@ static int swiotlb_search_pool_area(struct device *dev, struct io_tlb_pool *pool
BUG_ON(!nslots);
BUG_ON(area_index >= pool->nareas);
+ /*
+ * For mappings with an alignment requirement don't bother looping to
+ * unaligned slots once we found an aligned one.
+ */
+ stride = get_max_slots(max(alloc_align_mask, iotlb_align_mask));
+
/*
* For allocations of PAGE_SIZE or larger only look for page aligned
* allocations.
*/
if (alloc_size >= PAGE_SIZE)
- iotlb_align_mask |= ~PAGE_MASK;
- iotlb_align_mask &= ~(IO_TLB_SIZE - 1);
-
- /*
- * For mappings with an alignment requirement don't bother looping to
- * unaligned slots once we found an aligned one.
- */
- stride = (iotlb_align_mask >> IO_TLB_SHIFT) + 1;
+ stride = max(stride, PAGE_SHIFT - IO_TLB_SHIFT + 1);
spin_lock_irqsave(&area->lock, flags);
if (unlikely(nslots > pool->area_nslabs - area->used))
@@ -1015,14 +1014,16 @@ static int swiotlb_search_pool_area(struct device *dev, struct io_tlb_pool *pool
index = area->index;
for (slots_checked = 0; slots_checked < pool->area_nslabs; ) {
- slot_index = slot_base + index;
+ phys_addr_t tlb_addr;
- if (orig_addr &&
- (slot_addr(tbl_dma_addr, slot_index) &
- iotlb_align_mask) != (orig_addr & iotlb_align_mask)) {
+ slot_index = slot_base + index;
+ tlb_addr = slot_addr(tbl_dma_addr, slot_index);
+
+ if ((tlb_addr & alloc_align_mask) ||
+ (orig_addr && (tlb_addr & iotlb_align_mask) !=
+ (orig_addr & iotlb_align_mask))) {
index = wrap_area_index(pool, index + 1);
slots_checked++;
- continue;
}
if (!iommu_is_span_boundary(slot_index, nslots,