Re: [PATCH v5 6/6] swiotlb: Remove pointless stride adjustment for allocations >= PAGE_SIZE

From: Petr Tesařík
Date: Tue Mar 05 2024 - 10:16:13 EST


On Tue, 5 Mar 2024 11:20:13 +0000
Robin Murphy <robin.murphy@xxxxxxx> wrote:

> On 2024-03-04 6:22 pm, Michael Kelley wrote:
> > From: Robin Murphy <robin.murphy@xxxxxxx> Sent: Monday, March 4, 2024 8:54 AM
> >>
> >> On 04/03/2024 4:10 pm, Michael Kelley wrote:
> >>> From: Will Deacon <will@xxxxxxxxxx> Sent: Monday, March 4, 2024 8:02 AM
> >>>>
> >>>> Hi folks,
> >>>>
> >>>> On Mon, Mar 04, 2024 at 04:55:06PM +0100, Petr Tesařík wrote:
> >>>>> On Mon, 4 Mar 2024 13:37:56 +0000
> >>>>> Robin Murphy <robin.murphy@xxxxxxx> wrote:
> >>>>>> On 04/03/2024 11:00 am, Petr Tesařík wrote:
> >>>>>> [...]
> >>>>>>>> Here's my take on tying all the threads together. There are
> >>>>>>>> four alignment combinations:
> >>>>>>>>
> >>>>>>>> 1. alloc_align_mask: zero; min_align_mask: zero
> >>>>
> >>>> Based on this ^^^ ...
> >>>>
> >>>>>>>> xen_swiotlb_map_page() and dma_direct_map_page() are #1 or #2
> >>>>>>>> via swiotlb_map() and swiotlb_tbl_map_single()
> >>>>>>>>
> >>>>>>>> iommu_dma_map_page() is #3 and #4 via swiotlb_tbl_map_single()
> >>>>>>>>
> >>>>>>>> swiotlb_alloc() is #3, directly to swiotlb_find_slots()
> >>>>>>>>
> >>>>>>>> For #1, the returned physical address has no constraints if
> >>>>>>>> the requested size is less than a page. For page size or
> >>>>>>>> greater, the discussed historical requirement for page
> >>>>>>>> alignment applies.
> >>>>
> >>>> ... and this ^^^ ...
> >>>>
> >>>>
> >>>>> I believe this patch series is now good as is, except the commit
> >>>>> message should make it clear that alloc_align_mask and min_align_mask
> >>>>> can both be zero, but that simply means no alignment constraints.
> >>>>
> >>>> ... my (possibly incorrect!) reading of the thread so far is that we
> >>>> should preserve page-aligned allocation in this case if the allocation
> >>>> size is >= PAGE_SIZE.
> >>>>
> >>>> Something like the diff below, to replace this final patch?
> >>>>
> >>>> Will
> >>>>
> >>>> --->8
> >>>>
> >>>> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> >>>> index c381a7ed718f..67eac05728c0 100644
> >>>> --- a/kernel/dma/swiotlb.c
> >>>> +++ b/kernel/dma/swiotlb.c
> >>>> @@ -992,6 +992,14 @@ static int swiotlb_search_pool_area(struct device *dev, struct io_tlb_pool *pool
> >>>>  	BUG_ON(!nslots);
> >>>>  	BUG_ON(area_index >= pool->nareas);
> >>>>
> >>>> + /*
> >>>> + * Historically, allocations >= PAGE_SIZE were guaranteed to be
> >>>> + * page-aligned in the absence of any other alignment requirements.
> >>>> + * Since drivers may be relying on this, preserve the old behaviour.
> >>>> + */
> >>>> + if (!alloc_align_mask && !iotlb_align_mask && alloc_size >= PAGE_SIZE)
> >>>> + alloc_align_mask = PAGE_SIZE - 1;
> >>>> +
> >>>
> >>> Yes, I think that should do it.
> >>
> >> In principle it might be more logical to fudge this into
> >> iotlb_align_mask rather than alloc_align_mask
> >
> > I'm not understanding what you are getting at, but maybe we are
> > interpreting the historical page alignment requirement differently.
> > I think of the page alignment requirement as independent of the
> > orig_addr -- the returned physical address should always be exactly
> > page aligned, and not offset to match bits in orig_addr. If that's
> > the case, then implementing the page alignment via
> > alloc_align_mask is logically the right place. Fudging into
> > iotlb_align_mask would do matching of bits in orig_addr.
> >
> > Or is there something else I'm not considering?
>
> In short, it's that alloc_align_mask is concerned with how slots are
> allocated, while min_align_mask is concerned with where the data itself
> is bounced (which may also place certain constraints on allocation).
>
> The reason this page-alignment was here in the first place was seemingly
> to serve the original swiotlb_alloc() path, and thus it could be
> considered functionally equivalent to what is now alloc_align_mask.
> However the side-effect it happened to also have for streaming mappings
> was to prevent sufficiently large page-aligned buffers being bounced to
> a non-page-aligned address, which apparently managed to work well enough
> for NVMe until 64K pages became more common and ruined things by being
> too big, and we formalised *that* desired effect into min_align_mask.
>
> I get that forcing iotlb_align_mask here would introduce a stronger
> constraint which affects non-page-aligned buffers as well, and wanting
> to avoid that is perhaps a reasonable concern, so once again I'm really
> just arguing semantics. Given the head-scratching we've been through
> over this already, I think it would be valuable to at least be clearer
> that a significant part of the "old behaviour" is to do the right thing
> for swiotlb_alloc(), which is very much still current and necessary, but
> the reason we're not setting alloc_align_mask there is because doing it
> here also preserves this legacy side-effect which acts as a limited
> version of min_align_mask to preserve page alignment when bouncing.

I'm confused. Doesn't PATCH 3/6 of this series change swiotlb_alloc()
to use alloc_align_mask (and finally do it properly for sizes greater
than page size)?

Anyway, the placement of streaming DMA buffers is not under the control
of the device driver IIUC. These buffers may have been allocated without
anyone knowing that they would one day be used for I/O. I cannot imagine
how a device driver could rely on their physical address beyond the
offset within a page. And even then it's subtly broken, because the size
of the mapping may be less than PAGE_SIZE (e.g. 512 bytes as the
traditional sector size under direct I/O). Even if we keep the alignment
based on mapping size for now, it should probably be deprecated in
favour of min_align_mask.

Petr T