Re: next-0519 on thinkpad x60: sound related? window manager crash

From: David Rientjes
Date: Wed Jun 10 2020 - 01:26:51 EST


On Tue, 9 Jun 2020, Christoph Hellwig wrote:

> > Working theory is that CONFIG_DMA_NONCOHERENT_MMAP getting set is causing
> > the error_code in the page fault path. Debugging with Alex off-thread we
> > found that dma_{alloc,free}_from_pool() are not getting called from the
> > new code in dma_direct_{alloc,free}_pages() and he has not enabled
> > mem_encrypt.
>
> While DMA_COHERENT_POOL absolutely should not select DMA_NONCOHERENT_MMAP
> (and you should send your patch either way), I don't think it is going
> to make a difference here, as DMA_NONCOHERENT_MMAP just means we
> allow mmaps even for non-coherent devices, and we do not support
> non-coherent devices on x86.
>

We haven't heard yet whether disabling DMA_NONCOHERENT_MMAP fixes
Aaron's BUG() (the patch also prints some other debugging hints in case
it doesn't), but I'll share what we figured out:

In 5.7, his config didn't have DMA_DIRECT_REMAP or DMA_REMAP (it did have
GENERIC_ALLOCATOR already). AMD_MEM_ENCRYPT is set.

In Linus' HEAD, AMD_MEM_ENCRYPT now selects DMA_COHERENT_POOL, so both of
the aforementioned options are now set.

We also figured out that dma_should_alloc_from_pool() is always false up
until the BUG(). So what else changed? Only the selection of DMA_REMAP
and DMA_NONCOHERENT_MMAP.

The comment in the Kconfig about setting "an uncached bit in the
pagetables" led me to believe it may be related to the splat he's seeing
(reserved bit violation). So I suggested dropping DMA_NONCOHERENT_MMAP
from his Kconfig for testing purposes.



If this option should not implicitly be set for DMA_COHERENT_POOL, then I
assume we need yet another Kconfig option since DMA_REMAP selected it
before and DMA_COHERENT_POOL selects DMA_REMAP :)

So do we want a DMA_REMAP_BUT_NO_DMA_NONCOHERENT_MMAP? Decouple DMA_REMAP
from DMA_NONCOHERENT_MMAP and select the latter wherever the former was
set (but not DMA_COHERENT_POOL)? Something else?