RE: [PATCH 1/1] swiotlb: Fix swiotlb_bounce() to do partial sync's correctly

From: Michael Kelley
Date: Fri Mar 29 2024 - 11:24:24 EST


From: Dominique Martinet <dominique.martinet@xxxxxxxxxxxxxxxxx> Sent: Thursday, March 28, 2024 10:09 PM
>
> Dominique Martinet wrote on Wed, Mar 27, 2024 at 03:05:18PM +0900:
> > Unfortunately that was ages ago so I don't really remember exactly on
> > which device that was reproduced.. Given the Cc I'm sure Lukas had hit
> > it on the MNT reform (i.MX8MQ), but I did say I tested it so I probably
> > could reproduce on my i.MX8MP?
> > I'll try to give a try at reproducing the old bug, and if I do test your
> > fix over next week.
>
> grmbl, sorry I cannot reproduce the problem on devices I have readily
> accessible, and don't have time to dig out my reform to test there in
> the forseeable future, so cannot confirm if that also fixes the problem
> we reported two years ago.
>
> However I had misunderstood your patch, I thought you were also
> reverting commit 5f89468e2f06 ("swiotlb: manipulate orig_addr when
> tlb_addr has offset") but you're keeping it and just making it signed --
> this should cause no problem for the use case I was concerned about as
> it fell within the bounds I had defined, so this is basically a no-op
> patch for my usecase and I don't expect this particular failure to pop
> back up here.
>
> Code-wise, I agree the checks I added in commit 868c9ddc182b ("swiotlb:
> add overflow checks to swiotlb_bounce") are too strict - I failed to
> consider the device minimum alignment part of swiotlb_align_offset, and
> thinking this through this can get weird.
> ... And now I'm looking again even with a big minimum alignment it's
> also too strict, so, right, let's work through an example.
>
>
> From my understanding of how orig_addr is computed, in the multi-block
> case we have something like this:
>
> (+ represent device's minimum alignment, bigger blocks with | are
> IO_TLB_SIZE blocks)
>
> 10 20 30 40 50 60
> 01234567890123456789012345678901234567890123456789012345678901234...
> |---+---+---+-block1+---+---+---|---+---+---+-block2+---+---+---|...
> ^ ^
> mem->slots[n].orig_addr mem->slots[n+1].orig_addr
> (=7) (=32+7=39)
>
> (memo of the code with your patch:
> index = (tlb_addr - mem->start) >> IO_TLB_SHIFT;
> orig_addr = mem->slots[index].orig_addr;
> swiotlb_align_offset(dev, orig_addr) = orig_addr & dev min align mask & (IO_TLB_SIZE-1)
> tlb_offset = (tlb_addr & (IO_TLB_SIZE - 1)) - swiotlb_align_offset(dev, orig_addr);
> orig_addr += tlb_offset;
> alloc_size -= tlb_offset;
> vaddr = mem->vaddr + tlb_addr - mem->start
> )
>
> So for this example we have IO_TLB_SIZE=32, dev min alignment=4,
> orig_addr's align value would be 7%4=3.
> Given say tlb_addr at 33, we'd find slot n, and compute
> tlb_offset = 1 - 3 = -2
>
> ... And then I just don't follow anymore?
>
> Computing the rest mechanically, for the example's sake let's say n=0
> so mem->start=7, index=0, and also set size to 4 bytes, vaddr to 0x1000.

I think your logic goes awry here. mem->vaddr is just the virtual
address equivalent of mem->start. See swiotlb_init_io_tlb_pool().
The computation here of vaddr is a back-handed way of getting
the virtual address equivalent of tlb_addr. For the purposes of
this discussion, we can just use tlb_addr and ignore the virt vs.
phys difference.

Your example is saying that the originally mapped area started at
orig_addr 7. You didn't specify the original size, but let's assume
it is at least 40. Since you've specified that swiotlb_bounce() is
called with a tlb_addr of 33, let's assume the tlb_addr for the
full mapped area as returned by swiotlb_tbl_map_single() is 3.
It must end in 0x3 because with dev min alignment = 4, the
low order two bits of the tlb_addr of the full mapped area
must match the low order two bits of the orig_addr.

Continuing your example, subsequently swiotlb_bounce() is
called with a tlb_addr of 33, and size 4. So we want to copy
4 bytes starting at the 30th byte (33 - 3) of the originally mapped
area, which is address 7 + 30 = 37.

In this case,
* tlb_offset = 1 - 3 = -2, as you describe above
* orig_addr = 39 + -2 = 37. The computation uses 39 from
slot[1], not the 7 from slot[0]. This computed 37 is the
correct orig_addr to use for the memcpy().
* size is still 4. There's no computation in swiotlb_bounce()
that changes "size".
* alloc_size is pulled from slot[1], and is adjusted by tlb_offset.
This adjusted alloc_size isn't used for anything except as a sanity
check against "size".

So I think your example works correctly with the updated
code. Note that if a driver calls swiotlb_bounce() with a
tlb_addr that is out-of-range, swiotlb_bounce() can't detect
that. A bogus "size" parameter _is_ detected because of
the check against the adjusted alloc_size.

Michael

>
> vaddr = 0x1000 + 33 - 7 = 0x1000 + 26
> orig_addr = 7 + -2 = 5
> size = 4 - -2 = 6
>
> then we'd proceed to memcpy either (vaddr, p2v(orig_addr), size) or the
> other way around, but this cannot be right:
> - size is bigger than what was requested, I fail to see how that can be
> allowed. I'd understand a smaller size assuming swiotlb_bounce gets
> called for each interval, but not a bigger size.
> - orig_addr is nowhere near 33.
>
>
> I thought this didn't make sense because of the minimum device
> alignment, but even with device alignment >= io tlb's with the very same
> example we would get
> tld_offset = 1 - 7 = -6
> now that one could make sense if we had used the following slot e.g.
> orig_addr being slot[1].orig_addr and we'd get back to 31, but that's
> not the case, and the size calculation is still off.
>
>
> So, long story short it took me half a day to get back into this code
> and the only thing I understand about it is that I don't understand it.
>
> I'm sure it works most of the case because everything is nicely aligned
> (since nobody complained about my checks before, and if there's no
> warning with these the code works), but I'd require some convincing to
> give a reviewed-by tag to this patch.
>
> Thanks for working on this though, I'll be happy to be pointed out at
> flaws in my logic or to look at another attempt...!!
> --
> Dominique