Re: [Linaro-mm-sig] [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

From: Christian König
Date: Tue Jun 22 2021 - 11:24:21 EST




Am 22.06.21 um 17:11 schrieb Jason Gunthorpe:
On Tue, Jun 22, 2021 at 04:12:26PM +0300, Oded Gabbay wrote:

1) Setting sg_page to NULL
2) 'mapping' pages for P2P DMA without going through the iommu
3) Allowing P2P DMA without using the p2p dma API to validate that it
can work at all in the first place.

All of these result in functional bugs in certain system
configurations.

Jason
Hi Jason,
Thanks for the feedback.
Regarding point 1, why is that a problem if we disable the option to
mmap the dma-buf from user-space ?
Userspace has nothing to do with needing struct pages or not

Point 1 and 2 mostly go together, you supporting the iommu is not nice
if you dont have struct pages.

You should study Logan's patches I pointed you at as they are solving
exactly this problem.

In addition, I didn't see any problem with sg_page being NULL in the
RDMA p2p dma-buf code. Did I miss something here ?
No, the design of the dmabuf requires the exporter to do the dma maps
and so it is only the exporter that is wrong to omit all the iommu and
p2p logic.

RDMA is OK today only because nobody has implemented dma buf support
in rxe/si - mainly because the only implementations of exporters don't
set the struct page and are thus buggy.

I will take two GAUDI devices and use one as an exporter and one as an
importer. I want to see that the solution works end-to-end, with real
device DMA from importer to exporter.
I can tell you it doesn't. Stuffing physical addresses directly into
the sg list doesn't involve any of the IOMMU code so any configuration
that requires IOMMU page table setup will not work.

Sure it does. See amdgpu_vram_mgr_alloc_sgt:

        amdgpu_res_first(res, offset, length, &cursor);
        for_each_sgtable_sg((*sgt), sg, i) {
                phys_addr_t phys = cursor.start + adev->gmc.aper_base;
                size_t size = cursor.size;
                dma_addr_t addr;

                addr = dma_map_resource(dev, phys, size, dir,
                                        DMA_ATTR_SKIP_CPU_SYNC);
                r = dma_mapping_error(dev, addr);
                if (r)
                        goto error_unmap;

                sg_set_page(sg, NULL, size, 0);
                sg_dma_address(sg) = addr;
                sg_dma_len(sg) = size;

                amdgpu_res_next(&cursor, cursor.size);
        }

dma_map_resource() does the IOMMU mapping for us.

Regards,
Christian.



Jason