Re: [PATCH V4 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

From: Vivek Gautam
Date: Tue Nov 28 2017 - 08:44:13 EST




On 11/28/2017 05:13 AM, Rob Clark wrote:
On Mon, Nov 27, 2017 at 5:22 PM, Stephen Boyd<sboyd@xxxxxxxxxxxxxx> wrote:
On 11/15, Vivek Gautam wrote:
Hi,


On Mon, Aug 7, 2017 at 5:59 PM, Rob Clark<robdclark@xxxxxxxxx> wrote:
On Mon, Aug 7, 2017 at 4:27 AM, Vivek Gautam
<vivek.gautam@xxxxxxxxxxxxxx> wrote:
On Thu, Jul 13, 2017 at 5:20 PM, Rob Clark<robdclark@xxxxxxxxx> wrote:
On Thu, Jul 13, 2017 at 1:35 AM, Sricharan R<sricharan@xxxxxxxxxxxxxx> wrote:
Hi Vivek,

On 7/13/2017 10:43 AM, Vivek Gautam wrote:
Hi Stephen,


On 07/13/2017 04:24 AM, Stephen Boyd wrote:
On 07/06, Vivek Gautam wrote:
@@ -1231,12 +1237,18 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
size_t size)
{
- struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+ struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
+ size_t ret;
if (!ops)
return 0;
- return ops->unmap(ops, iova, size);
+ pm_runtime_get_sync(smmu_domain->smmu->dev);
Can these map/unmap ops be called from an atomic context? I seem
to recall that being a problem before.
That's something which was dropped in the following patch merged in master:
523d7423e21b iommu/arm-smmu: Remove io-pgtable spinlock

Looks like we don't need locks here anymore?
Apart from the locking, wonder why a explicit pm_runtime is needed
from unmap. Somehow looks like some path in the master using that
should have enabled the pm ?

Yes, there are a bunch of scenarios where unmap can happen with
disabled master (but not in atomic context).
I would like to understand whether there is a situation where an unmap is
called in atomic context without an enabled master?

Let's say we have the case where all the unmap calls in atomic context happen
only from the master's context (in which case the device link should
take care of
the pm state of smmu), and the only unmap that happen in non-atomic context
is the one with master disabled. In such a case doesn it make sense to
distinguish
the atomic/non-atomic context and add pm_runtime_get_sync()/put_sync() only
for the non-atomic context since that would be the one with master disabled.

At least drm/msm needs to hold obj->lock (a mutex) in unmap, so it
won't unmap anything in atomic ctx (but it can unmap w/ master
disabled). I can't really comment about other non-gpu drivers. It
seems like a reasonable constraint that either master is enabled or
not in atomic ctx.

Currently we actually wrap unmap w/ pm_runtime_get/put_sync(), but I'd
like to drop that to avoid powering up the gpu.
Since the deferring the TLB maintenance doesn't look like the best approach [1],
how about if we try to power-up only the smmu from different client
devices such as,
GPU in the unmap path. Then we won't need to add pm_runtime_get/put() calls in
arm_smmu_unmap().

The client device can use something like - pm_runtime_get_supplier() since
we already have the device link in place with this patch series. This should
power-on the supplier (which is smmu) without turning on the consumer
(such as GPU).

pm_runtime_get_supplier() however is not exported at this moment.
Will it be useful to export this API and use it in the drivers.

I'm not sure pm_runtime_get_supplier() is correct either. That
feels like we're relying on the GPU driver knowing the internal
details of how the device links are configured.

what does pm_runtime_get_supplier() do if IOMMU driver hasn't setup
device-link?

It will be a no-op.

If it is a no-op, then I guess the GPU driver calling
pm_runtime_get_supplier() seems reasonable, and less annoying than
having special cases in pm_resume path.. I don't feel too bad about
having "just in case" get/put_supplier() calls in the unmap path.

Also, presumably we still want to avoid powering up GPU even if we
short circuit the firmware loading and rest of "booting up the GPU"..
since presumably the GPU draws somewhat more power than the IOMMU..
having the pm_resume/suspend path know about the diff between waking
up / suspending the iommu and itself doesn't really feel less-bad than
just doing "just in case" get/put_supplier() calls.

If it sounds okay, then i can send a patch that exports the
pm_runtime_get/put_suppliers() APIs.


Best regards
Vivek

BR,
-R

Is there some way to have the GPU driver know in its runtime PM
resume hook that it doesn't need to be powered on because it
isn't actively drawing anything or processing commands? I'm
thinking of the code calling pm_runtime_get() as proposed around
the IOMMU unmap path in the GPU driver and then having the
runtime PM resume hook in the GPU driver return some special
value to indicate that it didn't really resume because it didn't
need to and to treat the device as runtime suspended but not
return an error. Then the runtime PM core can keep track of that
and try to power the GPU on again when another pm_runtime_get()
is called on the GPU device.

This keeps the consumer API the same, always pm_runtime_get(),
but leaves the device driver logic of what to do when the GPU
doesn't need to power on to the runtime PM hook where the driver
has all the information.

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in
the body of a message tomajordomo@xxxxxxxxxxxxxxx
More majordomo info athttp://vger.kernel.org/majordomo-info.html

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project