Re: Linux 6.1-rc1 drm/amdgpu regression

From: Shuah Khan
Date: Wed Oct 19 2022 - 21:17:48 EST


On 10/19/22 15:24, Deucher, Alexander wrote:
[Public]

-----Original Message-----
From: Shuah Khan <skhan@xxxxxxxxxxxxxxxxxxx>
Sent: Wednesday, October 19, 2022 5:00 PM
To: Deucher, Alexander <Alexander.Deucher@xxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>;
linux-kernel@xxxxxxxxxxxxxxx; Shuah Khan <skhan@xxxxxxxxxxxxxxxxxxx>
Subject: Re: Linux 6.1-rc1 drm/amdgpu regression

On 10/19/22 14:27, Deucher, Alexander wrote:
[AMD Official Use Only - General]

-----Original Message-----
From: Shuah Khan <skhan@xxxxxxxxxxxxxxxxxxx>
Sent: Wednesday, October 19, 2022 4:00 PM
To: Deucher, Alexander <Alexander.Deucher@xxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>; Shuah Khan
<skhan@xxxxxxxxxxxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx
Subject: Linux 6.1-rc1 drm/amdgpu regression

Hi Alex,

I am seeing the same problem on Linux 6.1-rc1 that I sent reverts for on
5.10.147. This is on my laptop with an AMD Ryzen 7 PRO 5850U with Radeon
Graphics.

commit e3163bc8ffdfdb405e10530b140135b2ee487f89
Author: Alex Deucher <alexander.deucher@xxxxxxx>
Date: Fri Sep 9 11:53:27 2022 -0400

drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega

I see that the following has been reverted in Linux 6.1-rc1:

commit 66f99628eb24409cb8feb5061f78283c8b65f820
Author: Hamza Mahfooz <hamza.mahfooz@xxxxxxx>
Date: Tue Sep 6 15:01:49 2022 -0400

drm/amdgpu: use dirty framebuffer helper

However, I still see the following filling dmesg, and the system is unusable.
For now I switched back to Linux 6.0 as this is my primary system.

[drm] Fence fallback timer expired on ring sdma0
[drm] Fence fallback timer expired on ring gfx
[drm] Fence fallback timer expired on ring sdma0
[drm] Fence fallback timer expired on ring gfx
[drm] Fence fallback timer expired on ring sdma0
[drm] Fence fallback timer expired on ring sdma0
[drm] Fence fallback timer expired on ring sdma0
[drm] Fence fallback timer expired on ring gfx

Please let me know if I should send a revert for this for mainline as well.


Can you file a bug report (https://gitlab.freedesktop.org/drm/amd/-/issues)
and attach your dmesg output? I'd like to try and repro the issue if I can and
provide some patches to test. I'd like to avoid reverting the patch as that
will break the driver for users using vega dGPUs.

Makes sense. I will file the bug and attach dmesg. Since this is my primary
system, there will be some delay in getting this info to you and in testing
any patches you provide.


Actually I think I see what's wrong. Can you try the attached patch?


This patch worked. Clean boot without any warnings or timer expiry messages
from drm/amdgpu.

thanks,
-- Shuah