Re: The issue with page allocation 5.3 rc1-rc2 (seems drm culprit here)

From: Dave Airlie
Date: Sun Aug 04 2019 - 21:03:54 EST


On Mon, 5 Aug 2019 at 08:23, Mikhail Gavrilov
<mikhail.v.gavrilov@xxxxxxxxx> wrote:
>
> Hi folks,
> About two weeks ago, when commit 22051d9c4a57 landed on my system,
> the following error started appearing at random:
> "gnome-shell: page allocation failure: order:4,
> mode:0x40cc0(GFP_KERNEL|__GFP_COMP),
> nodemask=(null),cpuset=/,mems_allowed=0"
> Symptoms:
> The screen goes blank, as if entering power-saving mode,
> and the computer cannot be woken up for a few minutes.
>
> I ran a bisect, and it looks like the first bad commit is 476e955dd679.
> The full bisect logs are here: https://mega.nz/#F!kgYFxAIb!v1tcHANPy2ns1lh4LQLeIg
>
> I reported my findings to the amd-gfx mailing list, but no one answered.
> Until yesterday, I thought this was a bug in the amdgpu driver.
> But yesterday, after the error occurred again, the system hung
> completely, this time with a different error.

Does it still happen if you disable CONFIG_DRM_AMD_DC_DCN2_0? I'm assuming
you don't have a Navi GPU.

I think some struct grew too large in the Navi merge. Hopefully AMD
cares; otherwise we will have to disable Navi before release.
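
For context on the "order:4" in the failure message: an order-n request
asks the page allocator for 2^n physically contiguous pages, so order 4
means 16 pages, or 64 KiB with 4 KiB pages. The userspace sketch below
only illustrates that arithmetic. It is not the kernel's code, the
function name alloc_order() and the 4 KiB page size are assumptions made
for the example, and the thread does not say which struct is involved or
how big it is.

```c
#include <stddef.h>

/* Assumption for this sketch: 4 KiB pages, as on typical x86-64 systems. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Hypothetical helper: return the smallest order such that
 * (SKETCH_PAGE_SIZE << order) >= size, i.e. how many times the
 * page-sized span must be doubled to cover the request.
 */
static unsigned int alloc_order(size_t size)
{
	unsigned int order = 0;
	size_t span = SKETCH_PAGE_SIZE;

	while (span < size) {
		span <<= 1;	/* double the contiguous span */
		order++;
	}
	return order;
}
```

Under these assumptions, any object larger than 32 KiB and up to 64 KiB
lands in order 4, e.g. alloc_order(40000) returns 4. Contiguous runs of
that size get hard to find once memory is fragmented, which would fit the
failures showing up at random.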

I've directed this at the main AMD devs, who might be able to help.

Dave.