Re: [PATCH 01/12] dma-buf: add dynamic caching of sg_table

From: Christian KÃnig
Date: Wed May 22 2019 - 13:30:48 EST


Am 22.05.19 um 18:17 schrieb Sumit Semwal:
Hi Christian,

On Sat, 27 Apr 2019 at 05:31, Liam Mark <lmark@xxxxxxxxxxxxxx> wrote:
On Tue, 16 Apr 2019, Christian KÃnig wrote:

To allow a smooth transition from pinning buffer objects to dynamic
invalidation we first start to cache the sg_table for an attachment
unless the driver explicitly says to not do so.

---
drivers/dma-buf/dma-buf.c | 24 ++++++++++++++++++++++++
include/linux/dma-buf.h | 11 +++++++++++
2 files changed, 35 insertions(+)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 7c858020d14b..65161a82d4d5 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -573,6 +573,20 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
list_add(&attach->node, &dmabuf->attachments);

mutex_unlock(&dmabuf->lock);
+
+ if (!dmabuf->ops->dynamic_sgt_mapping) {
+ struct sg_table *sgt;
+
+ sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
+ if (!sgt)
+ sgt = ERR_PTR(-ENOMEM);
+ if (IS_ERR(sgt)) {
+ dma_buf_detach(dmabuf, attach);
+ return ERR_CAST(sgt);
+ }
+ attach->sgt = sgt;
+ }
+
return attach;

err_attach:
@@ -595,6 +609,10 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
if (WARN_ON(!dmabuf || !attach))
return;

+ if (attach->sgt)
+ dmabuf->ops->unmap_dma_buf(attach, attach->sgt,
+ DMA_BIDIRECTIONAL);
+
mutex_lock(&dmabuf->lock);
list_del(&attach->node);
if (dmabuf->ops->detach)
@@ -630,6 +648,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
if (WARN_ON(!attach || !attach->dmabuf))
return ERR_PTR(-EINVAL);

+ if (attach->sgt)
+ return attach->sgt;
+
I am concerned by this change to make caching the sg_table the default
behavior as this will result in the exporter's map_dma_buf/unmap_dma_buf
calls are no longer being called in
dma_buf_map_attachment/dma_buf_unmap_attachment.
Probably this concern from Liam got lost between versions of your
patches; could we please request a reply to these points here?

Sorry I indeed never got this mail, but this is actually not an issue because Daniel had similar concerns and we didn't made this the default in the final version.

This seems concerning to me as it appears to ignore the cache maintenance
aspect of the map_dma_buf/unmap_dma_buf calls.
For example won't this potentially cause issues for clients of ION.

If we had the following
- #1 dma_buf_attach coherent_device
- #2 dma_buf attach non_coherent_device
- #3 dma_buf_map_attachment non_coherent_device
- #4 non_coherent_device writes to buffer
- #5 dma_buf_unmap_attachment non_coherent_device
- #6 dma_buf_map_attachment coherent_device
- #7 coherent_device reads buffer
- #8 dma_buf_unmap_attachment coherent_device

There wouldn't be any CMO at step #5 anymore (specifically no invalidate)
so now at step #7 the coherent_device could read a stale cache line.

Also, now by default dma_buf_unmap_attachment no longer removes the
mappings from the iommu, so now by default dma_buf_unmap_attachment is not
doing what I would expect and clients are losing the potential sandboxing
benefits of removing the mappings.
Shouldn't this caching behavior be something that clients opt into instead
of being the default?

Well, it seems you are making incorrect assumptions about the cache maintenance of DMA-buf here.

At least for all DRM devices I'm aware of mapping/unmapping an attachment does *NOT* have any cache maintenance implications.

E.g. the use case you describe above would certainly fail with amdgpu, radeon, nouveau and i915 because mapping a DMA-buf doesn't stop the exporter from reading/writing to that buffer (just the opposite actually).

All of them assume perfectly coherent access to the underlying memory. As far as I know there is no documented cache maintenance requirements for DMA-buf.

The IOMMU concern on the other hand is certainly valid and I perfectly agree that keeping the mapping time as short as possible is desirable.

Regards,
Christian.

Liam

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

Best,
Sumit.