Re: MT76x2U crashes XHCI driver on AMD Ryzen system

From: Stanislaw Gruszka
Date: Mon Mar 04 2019 - 02:10:45 EST


On Thu, Feb 28, 2019 at 02:40:29PM +0100, Joerg Roedel wrote:
> On Thu, Feb 28, 2019 at 01:19:48PM +0100, Stanislaw Gruszka wrote:
> > Never mind, the patch is wrong; s->dma_address is initialized in sg_num_pages().
>
> Yes, it is. In sg_num_pages() the offset into the IOMMU mapping is
> stored in s->dma_address, taking also the segment boundary mask into
> account. map_sg() later only adds the base-address to that.

I have some more info about the issues in
https://bugzilla.kernel.org/show_bug.cgi?id=202673

We have some bugs in mt76. Apparently we should not use
page_frag_alloc() with a size bigger than PAGE_SIZE, because
page_frag_alloc() can fall back to a single-page allocation. We should
also not use unaligned sizes, as pointed out in commit
3bed3cc4156e ("net: Do not allocate page fragments that are not skb aligned").
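
To make the two constraints above concrete, here is a minimal userspace
sketch, not kernel code. The model_* names, the 4096-byte page and the
64-byte cache line (standing in for SMP_CACHE_BYTES) are assumptions made
for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration; the real ones are per-architecture. */
#define MODEL_PAGE_SIZE   4096u
#define MODEL_CACHE_BYTES 64u   /* stands in for SMP_CACHE_BYTES */

/* Round size up to the next multiple of align (a power of two). */
static size_t model_align_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/*
 * Return the fragment size to request, or 0 if the aligned size would
 * exceed one page and so could not be served by a single-page fallback.
 */
static size_t model_frag_size(size_t wanted)
{
    size_t aligned = model_align_up(wanted, MODEL_CACHE_BYTES);

    return aligned <= MODEL_PAGE_SIZE ? aligned : 0;
}
```

With these assumed values, a request for 2000 bytes is rounded up to 2048,
and anything above 4096 is rejected by the model.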

However, after fixing that, mt76usb still did not work. To make things
work we had to change the rx frag size from 2048 to PAGE_SIZE and change
virt_to_head_page() to virt_to_page() when setting up the SGs.

I think I understand why the first change was needed. If we do 2 separate
dma maps of 2 different buffers in a single page, i.e. (PAGE + off=0
and PAGE + off=2048), it causes problems. So either map_sg() returns an
error which mt76usb does not handle correctly, or there is an issue
in AMD IOMMU because two dma maps use the same page.
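
The situation described above (two rx buffers landing in one page) can be
checked with simple address arithmetic. This is only a userspace model with
hypothetical model_* helpers and an assumed 4 KiB page; it says nothing
about what the IOMMU code actually does with such mappings.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12u                      /* assumed 4 KiB pages */
#define MODEL_PAGE_SIZE  (1u << MODEL_PAGE_SHIFT)

/* Page frame number of the page containing addr. */
static uint64_t model_pfn(uint64_t addr)
{
    return addr >> MODEL_PAGE_SHIFT;
}

/* Byte offset of addr inside its page. */
static uint32_t model_offset(uint64_t addr)
{
    return (uint32_t)(addr & (MODEL_PAGE_SIZE - 1));
}

/* Nonzero if buffers [a, a+len_a) and [b, b+len_b) touch a common page. */
static int model_share_page(uint64_t a, uint32_t len_a,
                            uint64_t b, uint32_t len_b)
{
    uint64_t a_first, a_last, b_first, b_last;

    assert(len_a > 0 && len_b > 0);
    a_first = model_pfn(a);
    a_last  = model_pfn(a + len_a - 1);
    b_first = model_pfn(b);
    b_last  = model_pfn(b + len_b - 1);
    return a_first <= b_last && b_first <= a_last;
}
```

In this model, two 2048-byte buffers at offsets 0 and 2048 of the same page
share a page frame, while two PAGE_SIZE buffers on consecutive pages do not.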

But I don't understand why the second change was needed. Without
it we have an issue with an incorrect page->_refcount. It is somehow
related to AMD IOMMU, because on other platforms we do not
have such problems.
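
For reference, this is how the (page, offset) pair put into an SG entry
differs between the two helpers. It is a userspace model under assumptions:
the buffer comes from a compound allocation of order 3 (8 pages), pages are
4 KiB, and the model_* names are made up for this sketch. It only shows the
arithmetic and does not explain the refcount problem.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT     12u                  /* assumed 4 KiB pages */
#define MODEL_PAGE_SIZE      (1u << MODEL_PAGE_SHIFT)
#define MODEL_COMPOUND_ORDER 3u                   /* assumed: 8 pages */
#define MODEL_COMPOUND_SIZE  (MODEL_PAGE_SIZE << MODEL_COMPOUND_ORDER)

/* What would be stored in one SG entry: a page and an offset into it. */
struct model_sg_entry {
    uint64_t page_index;   /* index of the page inside the compound page */
    uint32_t offset;       /* byte offset relative to that page */
};

/* virt_to_page()-like: use the page that actually contains addr. */
static struct model_sg_entry model_by_page(uint64_t base, uint64_t addr)
{
    struct model_sg_entry e;
    uint64_t rel = addr - base;

    assert(addr >= base && rel < MODEL_COMPOUND_SIZE);
    e.page_index = rel >> MODEL_PAGE_SHIFT;
    e.offset = (uint32_t)(rel & (MODEL_PAGE_SIZE - 1));
    return e;
}

/* virt_to_head_page()-like: always use the first page of the compound. */
static struct model_sg_entry model_by_head_page(uint64_t base, uint64_t addr)
{
    struct model_sg_entry e;
    uint64_t rel = addr - base;

    assert(addr >= base && rel < MODEL_COMPOUND_SIZE);
    e.page_index = 0;
    e.offset = (uint32_t)rel;   /* may be larger than MODEL_PAGE_SIZE */
    return e;
}
```

For the fourth 2048-byte buffer (byte 6144 of the allocation) the first
helper yields page 1 with offset 2048, the second yields page 0 with offset
6144. Both describe the same byte, but only the first keeps the offset
below the page size.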

Joerg, could you look at this? Thanks.

Stanislaw