Re: [PATCH] drm/panfrost: Ignore core_mask for poweroff and sync interrupts

From: AngeloGioacchino Del Regno
Date: Thu Nov 23 2023 - 10:14:27 EST


Il 23/11/23 14:51, Boris Brezillon ha scritto:
On Thu, 23 Nov 2023 14:24:57 +0100
AngeloGioacchino Del Regno <angelogioacchino.delregno@xxxxxxxxxxxxx>
wrote:


So, while I agree that it'd be slightly more readable as a diff if those
were two different commits I do have reasons against splitting.....

If we just need a quick fix to avoid PWRTRANS interrupts from kicking
in when we power-off the cores, I think we'd be better off dropping
GPU_IRQ_POWER_CHANGED[_ALL] from the value we write to GPU_INT_MASK
at [re]initialization time, and then have a separate series that fixes
the problem more generically.

But that didn't work:
https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@xxxxxxxxxx/

I meant, your 'ignore-core_mask' fix + the
'drop GPU_IRQ_POWER_CHANGED[_ALL] in GPU_INT_MASK' one.

So,

https://lore.kernel.org/all/4c73f67e-174c-497e-85a5-cb053ce657cb@xxxxxxxxxxxxx/
+
https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@xxxxxxxxxx/



...while this "full" solution worked:
https://lore.kernel.org/all/39e9514b-087c-42eb-8d0e-f75dc620e954@xxxxxxxxxx/

https://lore.kernel.org/all/5b24cc73-23aa-4837-abb9-b6d138b46426@xxxxxxxxxx/


...so this *is* a "quick fix" already... :-)

It's a half-baked solution for the missing irq-synchronization-on-suspend
issue IMHO. I understand why you want it all in one patch that can serve
as a fix for 123b431f8a5c ("drm/panfrost: Really power off GPU cores in
panfrost_gpu_power_off()"), which is why I'm suggesting to go for an
even simpler diff (see below), and then fully address the
irq-synhronization-on-suspend issue in a follow-up patchset.

--->8---
diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
index 09f5e1563ebd..6e2d7650cc2b 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
@@ -78,7 +78,10 @@ int panfrost_gpu_soft_reset(struct panfrost_device *pfdev)
}
gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
- gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL);
+ gpu_write(pfdev, GPU_INT_MASK,
+ GPU_IRQ_MASK_ERROR |
+ GPU_IRQ_PERFCNT_SAMPLE_COMPLETED |
+ GPU_IRQ_CLEAN_CACHES_COMPLETED);

...but if we do that, the next patch(es) will contain a partial revert of this
commit, putting back this to gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL)...

I'm not sure that it's worth changing this like that, then changing it back right
after :-\

Anyway, if anyone else agrees with doing it and then partially revert, I have no
issues going with this one instead; what I care about ultimately is resolving the
regression ASAP :-)

Cheers,
Angelo

/*
* All in-flight jobs should have released their cycle
@@ -425,11 +428,10 @@ void panfrost_gpu_power_on(struct panfrost_device *pfdev)
void panfrost_gpu_power_off(struct panfrost_device *pfdev)
{
- u64 core_mask = panfrost_get_core_mask(pfdev);
int ret;
u32 val;
- gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present & core_mask);
+ gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
val, !val, 1, 1000);
if (ret)
@@ -441,7 +443,7 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
if (ret)
dev_err(pfdev->dev, "tiler power transition timeout");
- gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present & core_mask);
+ gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);
ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
val, !val, 0, 1000);
if (ret)