Re: [PATCH] drm/panfrost: fix power transition timeout warnings

From: AngeloGioacchino Del Regno
Date: Mon Mar 25 2024 - 11:37:15 EST


Il 25/03/24 12:36, Christian Hewitt ha scritto:
On 25 Mar 2024, at 2:28 pm, Steven Price <steven.price@xxxxxxx> wrote:

On 22/03/2024 16:45, Christian Hewitt wrote:
Increase the timeout value to prevent system logs on Amlogic boards flooding
with power transition warnings:

[ 13.047638] panfrost ffe40000.gpu: shader power transition timeout
[ 13.048674] panfrost ffe40000.gpu: l2 power transition timeout
[ 13.937324] panfrost ffe40000.gpu: shader power transition timeout
[ 13.938351] panfrost ffe40000.gpu: l2 power transition timeout
...
[39829.506904] panfrost ffe40000.gpu: shader power transition timeout
[39829.507938] panfrost ffe40000.gpu: l2 power transition timeout
[39949.508369] panfrost ffe40000.gpu: shader power transition timeout
[39949.509405] panfrost ffe40000.gpu: l2 power transition timeout

The 2000 value has been found through trial and error testing with devices
using G52 and G31 GPUs.

How close to 2ms did you need in your trial and error testing? I'm
wondering if we should increase it further in case this might still
trigger occasionally?

I backed it off progressively but still saw occasional messages at 1.6ms
so padded it a little with 2ms, and those systems haven’t shown errors
since so I currently see it as a ’safe’ value. The one possible wildcard
is testing with older T820/T628 boards; but that needs to wait until I’m
back home from a long trip and able to test them. The possible theory
being that older/slower systems might require more time. Worst case I’ll
have to send another change.

kbase seems to have a 5s (5000ms!) timeout before it will actually
complain. But equally it doesn't busy wait on the registers in the same
way as panfrost, so the impact to the rest of the system of a long wait
is less.

But 2ms doesn't sound an unreasonable timeout so:

Reviewed-by: Steven Price <steven.price@xxxxxxx>

Fixes: 22aa1a209018 ("drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()")
Signed-off-by: Christian Hewitt <christianshewitt@xxxxxxxxx>> ---
drivers/gpu/drm/panfrost/panfrost_gpu.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
index 9063ce254642..fd8e44992184 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
@@ -441,19 +441,19 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)

gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
- val, !val, 1, 1000);
+ val, !val, 1, 2000);
if (ret)
dev_err(pfdev->dev, "shader power transition timeout");

gpu_write(pfdev, TILER_PWROFF_LO, pfdev->features.tiler_present);
ret = readl_relaxed_poll_timeout(pfdev->iomem + TILER_PWRTRANS_LO,
- val, !val, 1, 1000);
+ val, !val, 1, 2000);
if (ret)
dev_err(pfdev->dev, "tiler power transition timeout");

As Angelo points out the tiler probably doesn't need such a long
timeout, but I can't see the harm in consistency so I'm happy with this
change. If my memory of the hardware is correct then the tiler power off
actually does very little and so I wouldn't expect it to take very long.

I’ve seen tiler timeouts once I think and thus included it, but not since
the values were increased. As long as it’s acceptable I won’t over-think
it but if more testing is needed I can look at it more.


Thanks for clarifying.

Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@xxxxxxxxxxxxx>