Re: [PATCH v2] drm: use mgr->dev in drm_dbg_kms in drm_dp_add_payload_part2

From: Jeff Layton
Date: Sat Apr 22 2023 - 20:35:04 EST


On Wed, 2023-04-19 at 16:21 +0300, Jani Nikula wrote:
> On Wed, 19 Apr 2023, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > I've been experiencing some intermittent crashes down in the display
> > driver code. The symptoms are usually a line like this in dmesg:
> >
> > amdgpu 0000:30:00.0: [drm] Failed to create MST payload for port 000000006d3a3885: -5
> >
> > ...followed by an Oops due to a NULL pointer dereference.
> >
> > Switch to using mgr->dev instead of state->dev since "state" can be
> > NULL in some cases.
> >
> > Link: https://bugzilla.redhat.com/show_bug.cgi?id=2184855
> > Suggested-by: Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx>
> > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
>
> Thanks,
>
> Reviewed-by: Jani Nikula <jani.nikula@xxxxxxxxx>
>
>
> > ---
> > drivers/gpu/drm/display/drm_dp_mst_topology.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > I've been running this patch for a couple of days, but the problem
> > hasn't occurred again as of yet. It seems sane though as long as we can
> > assume that mgr->dev will be valid even when "state" is a NULL pointer.
> >
> > diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > index 38dab76ae69e..e2e21ce79510 100644
> > --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > @@ -3404,7 +3404,7 @@ int drm_dp_add_payload_part2(struct drm_dp_mst_topology_mgr *mgr,
> >
> >  	/* Skip failed payloads */
> >  	if (payload->vc_start_slot == -1) {
> > -		drm_dbg_kms(state->dev, "Part 1 of payload creation for %s failed, skipping part 2\n",
> > +		drm_dbg_kms(mgr->dev, "Part 1 of payload creation for %s failed, skipping part 2\n",
> >  			    payload->port->connector->name);
> >  		return -EIO;
> >  	}
>

Thanks for the reviews!

I finally had this happen again today, and I can confirm that this does
prevent the oops. GNOME rearranged my screen layout after the error, but
the box stayed up and running.
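
For anyone finding this in the archive later, here is a rough userspace
sketch of the failure mode and the fix. It is not the kernel code: the
struct names, dbg_skip() and add_payload_part2() are simplified,
hypothetical stand-ins, and the only assumption carried over from the
patch is that "state" can be NULL while mgr->dev stays valid.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel structures (only the fields used). */
struct device { const char *name; };
struct topology_state { struct device *dev; };
struct topology_mgr { struct device *dev; };
struct payload { int vc_start_slot; const char *connector_name; };

/* Stand-in for the debug print: it dereferences dev, so dev must be valid. */
static void dbg_skip(const struct device *dev, const char *connector)
{
	fprintf(stderr,
		"%s: [drm] Part 1 of payload creation for %s failed, skipping part 2\n",
		dev->name, connector);
}

/*
 * Sketch of the patched logic: the device for logging comes from mgr,
 * never from state, so a NULL state cannot be dereferenced on this path.
 */
static int add_payload_part2(struct topology_mgr *mgr,
			     struct topology_state *state,
			     struct payload *payload)
{
	assert(mgr != NULL && mgr->dev != NULL);
	(void)state;	/* may be NULL in some cases, so it is not touched here */

	/* Skip failed payloads */
	if (payload->vc_start_slot == -1) {
		/* Pre-patch equivalent was dbg_skip(state->dev, ...): oops if !state */
		dbg_skip(mgr->dev, payload->connector_name);
		return -EIO;
	}
	return 0;
}
```

Calling add_payload_part2() with a NULL state and a payload whose
vc_start_slot is -1 returns -EIO and logs the message instead of
crashing, which matches what I saw on the real box.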
--
Jeff Layton <jlayton@xxxxxxxxxx>