Re: [PATCH] drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()

From: Krzysztof Kozlowski
Date: Tue Nov 21 2023 - 10:34:44 EST


On 08/11/2023 14:20, Steven Price wrote:
> On 02/11/2023 14:15, AngeloGioacchino Del Regno wrote:
>> The layout of the registers {TILER,SHADER,L2}_PWROFF_LO, used to request
>> powering off cores, is the same as the {TILER,SHADER,L2}_PWRON_LO ones:
>> this means that in order to request poweroff of cores, we are supposed
>> to write a bitmask of cores that should be powered off!
>> This means that the panfrost_gpu_power_off() function has always been
>> doing nothing.
>>
>> Fix powering off the GPU by writing a bitmask of the cores to poweroff
>> to the relevant PWROFF_LO registers and then check that the transition
>> (from ON to OFF) has finished by polling the relevant PWRTRANS_LO
>> registers.
>>
>> While at it, in order to avoid code duplication, move the core mask
>> logic from panfrost_gpu_power_on() to a new panfrost_get_core_mask()
>> function, used in both poweron and poweroff.
>>
>> Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver")
>> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@xxxxxxxxxxxxx>
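
For readers following the quoted description, the write-mask-then-poll sequence can be sketched as below. This is only an illustrative simulation, not the panfrost code: the register indices, the in-memory regs[] array, the gpu_write()/gpu_read() helpers and shader_power_off() are all hypothetical names for this sketch, and the "hardware" simply finishes one core transition per poll.

```c
#include <stdint.h>

/* Hypothetical register indices for this sketch; NOT the real panfrost offsets. */
enum { SHADER_READY_LO, SHADER_PWROFF_LO, SHADER_PWRTRANS_LO, NUM_REGS };

/* Simulated register file standing in for the GPU's MMIO space. */
static uint32_t regs[NUM_REGS];

/* Simulated write: PWROFF_LO takes a bitmask and starts a transition
 * for every masked core that is currently powered on. */
static void gpu_write(int reg, uint32_t val)
{
    if (reg == SHADER_PWROFF_LO) {
        regs[SHADER_PWRTRANS_LO] |= val & regs[SHADER_READY_LO];
        return;
    }
    regs[reg] = val;
}

/* Simulated read: returns the current value; each read of PWRTRANS_LO
 * then lets one pending core (lowest set bit) finish powering off. */
static uint32_t gpu_read(int reg)
{
    uint32_t val = regs[reg];

    if (reg == SHADER_PWRTRANS_LO && val) {
        uint32_t done = val & (~val + 1u);

        regs[SHADER_PWRTRANS_LO] &= ~done;
        regs[SHADER_READY_LO] &= ~done;
    }
    return val;
}

/* Request power-off by writing the core bitmask (writing 0 would request
 * nothing), then poll PWRTRANS_LO until no masked core is in transition.
 * Returns 0 on success, -1 if the poll budget runs out. */
static int shader_power_off(uint32_t core_mask, int max_polls)
{
    gpu_write(SHADER_PWROFF_LO, core_mask);

    for (int i = 0; i < max_polls; i++) {
        if ((gpu_read(SHADER_PWRTRANS_LO) & core_mask) == 0)
            return 0;
    }
    return -1;
}
```

In this sketch, setting regs[SHADER_READY_LO] to 0xF and calling shader_power_off(0xF, 16) returns 0 and leaves the ready mask at 0, while a poll budget of 2 returns -1. A real driver would poll with a time-based timeout rather than a fixed iteration count.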


Hi,

This commit recently landed in linux-next, and it causes an "external
abort on non-linefetch" during boot of my Odroid HC1 board.

At least, that is what bisect points to.

If fixed, please add:

Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@xxxxxxxxxx>

[ 4.861683] 8<--- cut here ---
[ 4.863429] Unhandled fault: external abort on non-linefetch (0x1008) at 0xf0c8802c
[ 4.871018] [f0c8802c] *pgd=433ed811, *pte=11800653, *ppte=11800453
...
[ 5.164010] panfrost_gpu_irq_handler from __handle_irq_event_percpu+0xcc/0x31c
[ 5.171276] __handle_irq_event_percpu from handle_irq_event+0x38/0x80
[ 5.177765] handle_irq_event from handle_fasteoi_irq+0x9c/0x250
[ 5.183743] handle_fasteoi_irq from generic_handle_domain_irq+0x28/0x38
[ 5.190417] generic_handle_domain_irq from gic_handle_irq+0x88/0xa8
[ 5.196741] gic_handle_irq from generic_handle_arch_irq+0x34/0x44
[ 5.202893] generic_handle_arch_irq from __irq_svc+0x8c/0xd0

Full log:
https://krzk.eu/#/builders/21/builds/4392/steps/11/logs/serial0

1. exynos_defconfig
2. HW: Odroid HC1
   ARMv7, octa-core (Cortex-A7+A15), Exynos5422 SoC
   arm,mali-t628

Bisect log:

git bisect start
# bad: [07b677953b9dca02928be323e2db853511305fa9] Add linux-next specific files for 20231121
git bisect bad 07b677953b9dca02928be323e2db853511305fa9
# good: [98b1cc82c4affc16f5598d4fa14b1858671b2263] Linux 6.7-rc2
git bisect good 98b1cc82c4affc16f5598d4fa14b1858671b2263
# good: [13e2401d5bdc7f5a30f2651c99f0e3374cdda815] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
git bisect good 13e2401d5bdc7f5a30f2651c99f0e3374cdda815
# bad: [3b586cd6d8e51c428675312e7c3f634eb96337e9] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply.git
git bisect bad 3b586cd6d8e51c428675312e7c3f634eb96337e9
# bad: [9d63fd5f05248c78d9a66ce5dbc9cf5649054848] Merge branch 'drm-next' of https://gitlab.freedesktop.org/agd5f/linux
git bisect bad 9d63fd5f05248c78d9a66ce5dbc9cf5649054848
# bad: [5dea0c3fedee65413271a5700e653eff633e9a7f] drm/panel-elida-kd35t133: Drop shutdown logic
git bisect bad 5dea0c3fedee65413271a5700e653eff633e9a7f
# good: [48d45fac3940347becd290b96b2fc6d5ad8171f7] accel/ivpu: Remove support for uncached buffers
git bisect good 48d45fac3940347becd290b96b2fc6d5ad8171f7
# bad: [809ef191ee600e8bcbe2f8a769e00d2d54c16094] drm/gpuvm: add drm_gpuvm_flags to drm_gpuvm
git bisect bad 809ef191ee600e8bcbe2f8a769e00d2d54c16094
# good: [a78422e9dff366b3a46ae44caf6ec8ded9c9fc2f] drm/sched: implement dynamic job-flow control
git bisect good a78422e9dff366b3a46ae44caf6ec8ded9c9fc2f
# bad: [e4178256094a76cc36d9b9aabe7482615959b26f] drm/virtio: use uint64_t more in virtio_gpu_context_init_ioctl
git bisect bad e4178256094a76cc36d9b9aabe7482615959b26f
# bad: [56e76c0179185568049913257c18069293f8bde9] drm/panfrost: Implement ability to turn on/off GPU clocks in suspend
git bisect bad 56e76c0179185568049913257c18069293f8bde9
# bad: [57d4e26717b030fd794df3534e6b2e806eb761e4] drm/panfrost: Perform hard reset to recover GPU if soft reset fails
git bisect bad 57d4e26717b030fd794df3534e6b2e806eb761e4
# bad: [22aa1a209018dc2eca78745f7666db63637cd5dc] drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()
git bisect bad 22aa1a209018dc2eca78745f7666db63637cd5dc
# first bad commit: [22aa1a209018dc2eca78745f7666db63637cd5dc] drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()


Best regards,
Krzysztof