On 2022-04-28 15:55, Andi Kleen wrote:
On 4/28/2022 7:45 AM, Christoph Hellwig wrote:
On Thu, Apr 28, 2022 at 03:44:36PM +0100, Robin Murphy wrote:
Rather than introduce this extra level of allocator complexity, how about
just dividing up the initial SWIOTLB allocation into multiple io_tlb_mem
instances?
Yeah. We're almost done removing all knowledge of swiotlb from drivers,
so the very last thing I want is an interface that allows a driver to
allocate a per-device buffer.
At least for TDX we need parallelism with a single device for performance.
So if you split up the io tlb mems for a device then you would need a new mechanism to load balance the requests for a single device over those. I doubt it would be any simpler.
Eh, I think it would be, since the round-robin retry loop can then just sit around the existing io_tlb_mem-based allocator, vs. the churn of inserting it in the middle, plus it's then really easy to statically distribute different starting points across different devices via dev->dma_io_tlb_mem if we wanted to.
Admittedly the overall patch probably ends up about the same size, since it likely pushes a bit more complexity into swiotlb_init to compensate, but that's still a trade-off I like.
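To make the shape of that concrete, here is a rough userspace sketch of a round-robin retry loop wrapped around a per-instance allocator. It is not kernel code: struct toy_tlb_mem, toy_pool_alloc(), toy_rr_alloc(), NR_POOLS and SLOTS_PER_POOL are all hypothetical stand-ins, and the "allocator" is only a free-slot counter. The start argument plays the role of a per-device starting point.

```c
#include <assert.h>

#define NR_POOLS 4        /* hypothetical number of io_tlb_mem-like instances */
#define SLOTS_PER_POOL 8  /* hypothetical number of slots per instance */

/* Toy stand-in for struct io_tlb_mem: tracks only a free-slot count. */
struct toy_tlb_mem {
	int free_slots;
};

/*
 * Stand-in for the existing per-instance allocator.
 * Returns 0 on success, -1 if this instance cannot satisfy the request.
 */
static int toy_pool_alloc(struct toy_tlb_mem *mem, int nslots)
{
	if (mem->free_slots < nslots)
		return -1;
	mem->free_slots -= nslots;
	return 0;
}

/*
 * Round-robin retry loop sitting around the per-instance allocator.
 * Tries each instance once, beginning at 'start' and wrapping around.
 * Returns the index of the instance that satisfied the request, or -1
 * if every instance was too full.
 */
static int toy_rr_alloc(struct toy_tlb_mem *pools, int nr_pools,
			int start, int nslots)
{
	assert(nr_pools > 0 && start >= 0);

	for (int i = 0; i < nr_pools; i++) {
		int idx = (start + i) % nr_pools;

		if (toy_pool_alloc(&pools[idx], nslots) == 0)
			return idx;
	}
	return -1;
}
```

The per-instance allocator is untouched here; the retry logic lives entirely in the outer loop, and giving different devices different start values spreads them statically across instances.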