Re: [PATCH v10 07/11] drm/etnaviv: Add support for the dma coherent device

From: Lucas Stach
Date: Wed Jun 21 2023 - 11:59:17 EST


Am Mittwoch, dem 21.06.2023 um 23:30 +0800 schrieb Sui Jingfeng:
> Hi,
>
> On 2023/6/21 18:00, Lucas Stach wrote:
> > > dma_sync_sgtable_for_cpu(dev->dev, etnaviv_obj->sgt,
> > > etnaviv_op_to_dma_dir(op));
> > > etnaviv_obj->last_cpu_prep_op = op;
> > > @@ -408,8 +421,9 @@ int etnaviv_gem_cpu_fini(struct drm_gem_object *obj)
> > > {
> > > struct drm_device *dev = obj->dev;
> > > struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj);
> > > + struct etnaviv_drm_private *priv = dev->dev_private;
> > >
> > > - if (etnaviv_obj->flags & ETNA_BO_CACHED) {
> > > + if (!priv->dma_coherent && etnaviv_obj->flags & ETNA_BO_CACHED) {
> > > /* fini without a prep is almost certainly a userspace error */
> > > WARN_ON(etnaviv_obj->last_cpu_prep_op == 0);
> > > dma_sync_sgtable_for_device(dev->dev, etnaviv_obj->sgt,
> > > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
> > > index 3524b5811682..754126992264 100644
> > > --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
> > > +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
> > > @@ -112,11 +112,16 @@ static const struct etnaviv_gem_ops etnaviv_gem_prime_ops = {
> > > struct drm_gem_object *etnaviv_gem_prime_import_sg_table(struct drm_device *dev,
> > > struct dma_buf_attachment *attach, struct sg_table *sgt)
> > > {
> > > + struct etnaviv_drm_private *priv = dev->dev_private;
> > > struct etnaviv_gem_object *etnaviv_obj;
> > > size_t size = PAGE_ALIGN(attach->dmabuf->size);
> > > + u32 cache_flags = ETNA_BO_WC;
> > > int ret, npages;
> > >
> > > - ret = etnaviv_gem_new_private(dev, size, ETNA_BO_WC,
> > > + if (priv->dma_coherent)
> > > + cache_flags = ETNA_BO_CACHED;
> > > +
> > Drop this change. Instead etnaviv_gem_new_impl() should do the upgrade
> > from WC to CACHED as necessary by adding something like this:
>
> I understand you are a profession person in vivante GPU driver domain.
>
> I respect you reviews and instruction.
>
> But, I'm really reluctant to agree with this, is there any space to
> negotiate?
>
> > /*
> > * Upgrade WC to CACHED when the device is hardware coherent and the
> > * platform doesn't allow mixing cached and writecombined mappings to
> > * the same memory area.
> > */
> > if ((flags & ETNA_BO_CACHE_MASK) == ETNA_BO_WC &&
> > dev_is_dma_coherent(dev) && !drm_arch_can_wc_memory())
> > flags = (flags & ~ETNA_BO_CACHE_MASK) & ETNA_BO_CACHED;
>
> This is policy, not a mechanism.
>
> Using what cache property is a user-space program's choice.
>
> While you are override the WC with CACHED mapping. This is not correct
> in the concept!
>
Please explain why you think that this isn't correct. If using WC
mappings cause a potential loss of coherency on your platform, then we
can not allow the userspace driver to use WC mappings.

As I would like to keep the option of WC mappings, I've asked you if
there are ways to prepare the cache in a way that WC mappings aren't
causing any troubles on your platform. You told me that this might be
possible but needs confirmation from a HW engineer and such
confirmation could take a long time.

With that in mind, our only option right now is to upgrade the mappings
to cached in order to not lay out traps for the userspace driver.

> you approach forbidden any possibility to use the WC BO at anywhere.
>
>
> My approach need only check once, while you approach need at least 3
> check plus
>
> so much bit-wise logic operations,  plus a function call  (&, ==, &&, 
> &, ~, &) .
>
> and every time you create a BO. This nasty judgement happens.
>
BO creation again is not a fast path. You are committing to allocate
new memory, which is a few orders of magnitude more costly than the few
instructions needed for those comparisons.

>
> Please keep our original implement, it's simple and clear, Please?
>

It isn't as simple and clear for the userspace interface. It allows
userspace to use WC mappings that would potentially cause loss of
coherency between CPU and GPU, which isn't acceptable.

Regards,
Lucas