Re: [PATCH] Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()"

From: Thorsten Leemhuis
Date: Thu Dec 01 2022 - 04:29:55 EST


Hi, this is your Linux kernel regression tracker. Top-posting for once,
to make this easily accessible to everyone.

Has any progress been made to fix this regression? It afaics is not a
release critical issue, but well, it still would be nice to get this
fixed before 6.1 is released.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

On 24.11.22 12:55, Catalin Marinas wrote:
> On Mon, Nov 21, 2022 at 03:42:27PM +0530, Sibi Sankar wrote:
>> On 11/21/22 12:12, Manivannan Sadhasivam wrote:
>>> On Fri, Nov 18, 2022 at 12:33:49PM +0000, Will Deacon wrote:
>>>> On Fri, Nov 18, 2022 at 04:24:02PM +0530, Manivannan Sadhasivam wrote:
>>>>> On Mon, Nov 14, 2022 at 05:38:00PM +0000, Catalin Marinas wrote:
>>>>>> On Mon, Nov 14, 2022 at 03:14:21PM +0000, Robin Murphy wrote:
>>>>>>> Clearly that driver is completely broken though. If the DMA allocation came
>>>>>>> from a no-map carveout vma_dma_alloc_from_dev_coherent() then the vmap()
>>>>>>> shenanigans wouldn't work, so if it backed by struct pages then the whole
>>>>>>> dance is still pointless because *a cacheable linear mapping exists*, and
>>>>>>> it's just relying on the reduced chance that anything's going to re-fetch
>>>>>>> the linear map address after those pages have been allocated, exactly as I
>>>>>>> called out previously[1].
>>>>>>
>>>>>> So I guess a DMA pool that's not mapped in the linear map, together with
>>>>>> memremap() instead of vmap(), would work around the issue. But the
>>>>>> driver needs fixing, not the arch code.
>>>>>
>>>>> Okay, thanks for the hint. Can you share how to allocate the dma-pool that's
>>>>> not part of the kernel's linear map? I looked into it but couldn't find a way.
>>>>
>>>> The no-map property should take care of this iirc
>>>
>>> Yeah, we have been using it in other places of the same driver. But as per
>>> Sibi, we used dynamic allocation for metadata validation since there was no
>>> memory reserved statically for that.
>>
>> Unlike the other portions in the driver that required statically defined
>> no-map carveouts, metadata just needed a contiguous memory for
>> authentication. Re-using existing carveouts for this metadata region
>> may not work due to modem FW limitations and declaring a new carveout for
>> metadata will break the device tree bindings. That's the reason for
>> using DMA_ATTR_NO_KERNEL_MAPPING for dma_alloc_attr and vmpa/vunmap with
>> VM_FLUSH_RESET_PERMS before passing the memory onto modem. Are there other
>> suggestions for achieving the same without breaking bindings?
>
> Your DMA_ATTR_NO_KERNEL_MAPPING workaround doesn't work, it only makes
> the failure rate smaller. All this attribute does is avoiding creating a
> non-cacheable mapping but you still have the kernel linear mapping in
> place that may be speculatively accessed by the CPU. You were just lucky
> so far not to have hit the issue. So I'd rather see this fixed properly
> with a no-map carveout. Maybe you can reuse an existing carveout if the
> driver already needs some and avoid changing the DT. More complicated
> options include allocating memory and unmapping it from the linear map
> with set_memory_valid(), though that's not exported to modules and it
> also requires the linear map to be pages only, not block mappings.
>
> Yet another option is to have the swiotlb buffer unmapped from the
> kernel linear map and use the bounce buffer for this. That's more
> involved (Robin has some patches, though for a different reason and they
> may not make it upstream).
>