Re: drm/msm: DisplayPort regressions in 6.8-rc1

From: Johan Hovold
Date: Sat Feb 17 2024 - 10:15:04 EST


On Wed, Feb 14, 2024 at 02:52:06PM +0100, Johan Hovold wrote:
> On Tue, Feb 13, 2024 at 10:00:13AM -0800, Abhinav Kumar wrote:
>
> > I do agree that pm runtime eDP driver got merged that time but I think
> > the issue is either a combination of that along with DRM aux bridge
> > https://patchwork.freedesktop.org/series/122584/ OR just the latter as
> > even that went in around the same time.
>
> Yes, indeed there were a lot of changes that went into the MSM drm driver
> in 6.8-rc1 and since I have not tried to debug this myself I can't say
> for sure which change or changes triggered this regression (or
> possibly regressions).
>
> The fact that the USB-C/DP PHY appears to be involved
> (/soc@0/phy@88eb000) could indeed point to the series you mentioned.
>
> > That's perhaps why this issue was not seen with the chromebooks we tested
> > on as they do not use pmic_glink (aux bridge).
> >
> > So we will need to debug this on sc8280xp specifically or an equivalent
> > device which uses aux bridge.
>
> I've hit the NULL-pointer dereference three times now in the last few days
> on the sc8280xp CRD. But since it doesn't trigger on every boot it seems
> you need to go back to the series that could potentially have caused
> this regression and review them again. There's clearly something quite
> broken here.

Since Dmitry had trouble reproducing this issue I took a closer look at
the DRM aux bridge series that Abhinav pointed to and was able to track
down the bridge regressions and come up with a reproducer. I just posted
a series fixing this here:

https://lore.kernel.org/lkml/20240217150228.5788-1-johan+linaro@xxxxxxxxxx/

As I mentioned in the cover letter, however, I am still seeing
intermittent hard resets around the time that the DRM subsystem is
initialising, which suggests that we may be dealing with two separate
DRM regressions here.

If the hard resets are triggered by something like unclocked hardware,
perhaps that bit could be related to the runtime PM rework?

Johan