Re: [PATCH 2/3] udmabuf: Sync buffer mappings for attached devices

From: Andrew Davis
Date: Fri Jan 26 2024 - 12:40:55 EST


On 1/26/24 1:25 AM, Kasireddy, Vivek wrote:
Currently this driver creates a SGT table using the CPU as the
target device, then performs the dma_sync operations against
that SGT. This is backwards to how DMA-BUFs are supposed to behave.
This may have worked for the case where these buffers were given
only back to the same CPU that produced them as in the QEMU case.
And only then because the original author had the dma_sync
operations also backwards, syncing for the "device" on begin_cpu.
This was noticed and "fixed" in this patch[0].

That then meant we were sync'ing from the CPU to the CPU using
a pseudo-device "miscdevice". Which then caused another issue
due to the miscdevice not having a proper DMA mask (and why should
it, the CPU is not a DMA device). The fix for that was an even
more egregious hack[1] that declares the CPU is coherent with
itself and can access its own memory space..

Unwind all this and perform the correct action by doing the dma_sync
operations for each device currently attached to the backing buffer.
Makes sense.


[0] commit 1ffe09590121 ("udmabuf: fix dma-buf cpu access")
[1] commit 9e9fa6a9198b ("udmabuf: Set the DMA mask for the udmabuf
device (v2)")

Signed-off-by: Andrew Davis <afd@xxxxxx>
---
drivers/dma-buf/udmabuf.c | 41 +++++++++++++++------------------------
1 file changed, 16 insertions(+), 25 deletions(-)

diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index 3a23f0a7d112a..ab6764322523c 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/drivers/dma-buf/udmabuf.c
@@ -26,8 +26,6 @@ MODULE_PARM_DESC(size_limit_mb, "Max size of a
dmabuf, in megabytes. Default is
struct udmabuf {
pgoff_t pagecount;
struct page **pages;
- struct sg_table *sg;
- struct miscdevice *device;
struct list_head attachments;
struct mutex lock;
};
@@ -169,12 +167,8 @@ static void unmap_udmabuf(struct
dma_buf_attachment *at,
static void release_udmabuf(struct dma_buf *buf)
{
struct udmabuf *ubuf = buf->priv;
- struct device *dev = ubuf->device->this_device;
pgoff_t pg;

- if (ubuf->sg)
- put_sg_table(dev, ubuf->sg, DMA_BIDIRECTIONAL);
What happens if the last importer maps the dmabuf but erroneously
closes it immediately? Would unmap somehow get called in this case?


Good question, had to scan the framework code a bit here. I thought
closing a DMABUF handle would automatically unwind any current
attachments/mappings, but it seems nothing in the framework does that.

Looks like that is up to the importing drivers[0]:

Once a driver is done with a shared buffer it needs to call
dma_buf_detach() (after cleaning up any mappings) and then
release the reference acquired with dma_buf_get() by
calling dma_buf_put().

So closing a DMABUF after mapping without first unmapping it would
be a bug in the importer, it is not the exporters problem to check
It may be a bug in the importer but wouldn't the memory associated
with the sg table and attachment get leaked if unmap doesn't get called
in this scenario?


Yes the attachment data would be leaked if unattach was not called,
but that is true for all DMABUF exporters. The .release() callback
is meant to be the mirror of the export function and it only cleans
up that. Same for attach/unattach, map/unmap, etc.. If these calls
are not balanced then yes they can leak memory.

Since balance is guaranteed by the API, checking the balance should
be done at that level, not in each and every exporter. If your
comment is that we should add those checks into the DMABUF framework
layer then I would agree.

Andrew

Thanks,
Vivek

for (although some more warnings in the framework checking for that
might not be a bad idea..).

Andrew

[0] https://www.kernel.org/doc/html/v6.7/driver-api/dma-buf.html

Thanks,
Vivek

-
for (pg = 0; pg < ubuf->pagecount; pg++)
put_page(ubuf->pages[pg]);
kfree(ubuf->pages);
@@ -185,33 +179,31 @@ static int begin_cpu_udmabuf(struct dma_buf
*buf,
enum dma_data_direction direction)
{
struct udmabuf *ubuf = buf->priv;
- struct device *dev = ubuf->device->this_device;
- int ret = 0;
-
- if (!ubuf->sg) {
- ubuf->sg = get_sg_table(dev, buf, direction);
- if (IS_ERR(ubuf->sg)) {
- ret = PTR_ERR(ubuf->sg);
- ubuf->sg = NULL;
- }
- } else {
- dma_sync_sg_for_cpu(dev, ubuf->sg->sgl, ubuf->sg->nents,
- direction);
- }
+ struct udmabuf_attachment *a;

- return ret;
+ mutex_lock(&ubuf->lock);
+
+ list_for_each_entry(a, &ubuf->attachments, list)
+ dma_sync_sgtable_for_cpu(a->dev, a->table, direction);
+
+ mutex_unlock(&ubuf->lock);
+
+ return 0;
}

static int end_cpu_udmabuf(struct dma_buf *buf,
enum dma_data_direction direction)
{
struct udmabuf *ubuf = buf->priv;
- struct device *dev = ubuf->device->this_device;
+ struct udmabuf_attachment *a;

- if (!ubuf->sg)
- return -EINVAL;
+ mutex_lock(&ubuf->lock);
+
+ list_for_each_entry(a, &ubuf->attachments, list)
+ dma_sync_sgtable_for_device(a->dev, a->table, direction);
+
+ mutex_unlock(&ubuf->lock);

- dma_sync_sg_for_device(dev, ubuf->sg->sgl, ubuf->sg->nents,
direction);
return 0;
}

@@ -307,7 +299,6 @@ static long udmabuf_create(struct miscdevice
*device,
exp_info.priv = ubuf;
exp_info.flags = O_RDWR;

- ubuf->device = device;
buf = dma_buf_export(&exp_info);
if (IS_ERR(buf)) {
ret = PTR_ERR(buf);
--
2.39.2