Re: [PATCH] drm/atomic_helper: Allow DPMS On<->Off changes for unregistered connectors

From: Daniel Vetter
Date: Thu Oct 11 2018 - 04:45:49 EST


On Wed, Oct 10, 2018 at 09:32:24PM +0300, Ville Syrjälä wrote:
> On Tue, Oct 09, 2018 at 04:44:24PM -0400, Lyude Paul wrote:
> > It appears when testing my previous fix for some of the legacy
> > modesetting issues with MST, I misattributed some kernel splats that
> > started appearing on my machine after a rebase as being from upstream.
> > But it appears they actually came from my patch series:
> >
> > [ 2.980512] [drm:drm_atomic_helper_check_modeset [drm_kms_helper]] Updating routing for [CONNECTOR:65:eDP-1]
> > [ 2.980516] [drm:drm_atomic_helper_check_modeset [drm_kms_helper]] [CONNECTOR:65:eDP-1] is not registered
> > [ 2.980516] ------------[ cut here ]------------
> > [ 2.980519] Could not determine valid watermarks for inherited state
> > [ 2.980553] WARNING: CPU: 3 PID: 551 at drivers/gpu/drm/i915/intel_display.c:14983 intel_modeset_init+0x14d7/0x19f0 [i915]
> > [ 2.980556] Modules linked in: i915(O+) i2c_algo_bit drm_kms_helper(O) syscopyarea sysfillrect sysimgblt fb_sys_fops drm(O) intel_rapl x86_pkg_temp_thermal iTCO_wdt wmi_bmof coretemp crc32_pclmul psmouse i2c_i801 mei_me mei i2c_core lpc_ich mfd_core tpm_tis tpm_tis_core wmi tpm thinkpad_acpi pcc_cpufreq video ehci_pci crc32c_intel serio_raw ehci_hcd xhci_pci xhci_hcd
> > [ 2.980577] CPU: 3 PID: 551 Comm: systemd-udevd Tainted: G O 4.19.0-rc7Lyude-Test+ #1
> > [ 2.980579] Hardware name: LENOVO 20BWS1KY00/20BWS1KY00, BIOS JBET63WW (1.27 ) 11/10/2016
> > [ 2.980605] RIP: 0010:intel_modeset_init+0x14d7/0x19f0 [i915]
> > [ 2.980607] Code: 89 df e8 ec 27 02 00 e9 24 f2 ff ff be 03 00 00 00 48 89 df e8 da 27 02 00 e9 26 f2 ff ff 48 c7 c7 c8 d1 34 a0 e8 23 cf dc e0 <0f> 0b e9 7c fd ff ff f6 c4 04 0f 85 37 f7 ff ff 48 8b 83 60 08 00
> > [ 2.980611] RSP: 0018:ffffc90000287988 EFLAGS: 00010282
> > [ 2.980614] RAX: 0000000000000000 RBX: ffff88031b488000 RCX: 0000000000000006
> > [ 2.980617] RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff880321ad54d0
> > [ 2.980620] RBP: ffffc90000287a10 R08: 000000000000040a R09: 0000000000000065
> > [ 2.980623] R10: ffff88030ebb8f00 R11: ffffffff81416590 R12: ffff88031b488000
> > [ 2.980626] R13: ffff88031b4883a0 R14: ffffc900002879a8 R15: ffff880319099800
> > [ 2.980630] FS: 00007f475620d180(0000) GS:ffff880321ac0000(0000) knlGS:0000000000000000
> > [ 2.980633] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 2.980636] CR2: 00007f9ef28018a0 CR3: 000000031b72c001 CR4: 00000000003606e0
> > [ 2.980639] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 2.980642] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [ 2.980645] Call Trace:
> > [ 2.980675] i915_driver_load+0xb0e/0xdc0 [i915]
> > [ 2.980681] ? kernfs_add_one+0xe7/0x130
> > [ 2.980709] i915_pci_probe+0x46/0x60 [i915]
> > [ 2.980715] pci_device_probe+0xd4/0x150
> > [ 2.980719] really_probe+0x243/0x3b0
> > [ 2.980722] driver_probe_device+0xba/0x100
> > [ 2.980726] __driver_attach+0xe4/0x110
> > [ 2.980729] ? driver_probe_device+0x100/0x100
> > [ 2.980733] bus_for_each_dev+0x74/0xb0
> > [ 2.980736] driver_attach+0x1e/0x20
> > [ 2.980739] bus_add_driver+0x159/0x230
> > [ 2.980743] ? 0xffffffffa0393000
> > [ 2.980746] driver_register+0x70/0xc0
> > [ 2.980749] ? 0xffffffffa0393000
> > [ 2.980753] __pci_register_driver+0x57/0x60
> > [ 2.980780] i915_init+0x55/0x58 [i915]
> > [ 2.980785] do_one_initcall+0x4a/0x1c4
> > [ 2.980789] ? do_init_module+0x27/0x210
> > [ 2.980793] ? kmem_cache_alloc_trace+0x131/0x190
> > [ 2.980797] do_init_module+0x60/0x210
> > [ 2.980800] load_module+0x2063/0x22e0
> > [ 2.980804] ? vfs_read+0x116/0x140
> > [ 2.980807] ? vfs_read+0x116/0x140
> > [ 2.980811] __do_sys_finit_module+0xbd/0x120
> > [ 2.980814] ? __do_sys_finit_module+0xbd/0x120
> > [ 2.980818] __x64_sys_finit_module+0x1a/0x20
> > [ 2.980821] do_syscall_64+0x5a/0x110
> > [ 2.980824] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > [ 2.980826] RIP: 0033:0x7f4754e32879
> > [ 2.980828] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f7 45 2c 00 f7 d8 64 89 01 48
> > [ 2.980831] RSP: 002b:00007fff43fd97d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> > [ 2.980834] RAX: ffffffffffffffda RBX: 0000559a44ca64f0 RCX: 00007f4754e32879
> > [ 2.980836] RDX: 0000000000000000 RSI: 00007f475599f4cd RDI: 0000000000000018
> > [ 2.980838] RBP: 00007f475599f4cd R08: 0000000000000000 R09: 0000000000000000
> > [ 2.980839] R10: 0000000000000018 R11: 0000000000000246 R12: 0000000000000000
> > [ 2.980841] R13: 0000559a44c92fd0 R14: 0000000000020000 R15: 0000000000000000
> > [ 2.980881] WARNING: CPU: 3 PID: 551 at drivers/gpu/drm/i915/intel_display.c:14983 intel_modeset_init+0x14d7/0x19f0 [i915]
> > [ 2.980884] ---[ end trace 5eb47a76277d4731 ]---
> >
> > The cause of this appears to be due to the fact that if there's
> > pre-existing display state that was set by the BIOS when i915 loads, it
> > will attempt to perform a modeset before the driver is registered with
> > userspace. Since this happens before the driver's registered with
> > userspace, it's connectors are also unregistered and thus-states which
> > would turn on DPMS on a connector end up getting rejected since the
> > connector isn't registered.
> >
> > These bugs managed to get past Intel's CI partially due to the fact it
> > never ran a full test on my patches for some reason, but also because
> > all of the tests unload the GPU once before running.
>
> Well, not all all tests, but all the module reload tests. If we can
> change the setup so that the driver isn't loaded until the first test
> executes we should be able catch all these issues as well. Which would
> be nice.
>
> > Since this bug is
> > only really triggered when the drivers tries to perform a modeset before
> > it's been fully registered with userspace when coming from whatever
> > display configuration the firmware left us with, it likely would never
> > have been picked up by CI in the first place.
> >
> > After some discussion with vsyrjala, we decided the best course of
> > action would be to just move the unregistered connector checks out of
> > update_connector_routing() and into drm_atomic_set_crtc_for_connector().
> > The reason for this being that legacy modesetting isn't going to be
> > expecting failures anywhere (at least this is the case with X), so
> > ideally we want to ensure that any DPMS changes will still work even on
> > unregistered connectors. Instead, we now only reject new modesets which
> > would change the current CRTC assigned to an unregistered connector
> > unless no new CRTC is being assigned to replace the connector's previous
> > one.
> >
> > Signed-off-by: Lyude Paul <lyude@xxxxxxxxxx>
> > Reported-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
> > Fixes: 4d80273976bf ("drm/atomic_helper: Disallow new modesets on unregistered connectors")
> > Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
> > Cc: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
> > Cc: stable@xxxxxxxxxxxxxxx
> > ---
> > drivers/gpu/drm/drm_atomic_helper.c | 21 +--------------------
> > drivers/gpu/drm/drm_atomic_uapi.c | 21 +++++++++++++++++++++
> > 2 files changed, 22 insertions(+), 20 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > index e6a2cf72de5e..6f66777dca4b 100644
> > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > @@ -319,26 +319,6 @@ update_connector_routing(struct drm_atomic_state *state,
> > return 0;
> > }
> >
> > - crtc_state = drm_atomic_get_new_crtc_state(state,
> > - new_connector_state->crtc);
> > - /*
> > - * For compatibility with legacy users, we want to make sure that
> > - * we allow DPMS On->Off modesets on unregistered connectors. Modesets
> > - * which would result in anything else must be considered invalid, to
> > - * avoid turning on new displays on dead connectors.
> > - *
> > - * Since the connector can be unregistered at any point during an
> > - * atomic check or commit, this is racy. But that's OK: all we care
> > - * about is ensuring that userspace can't do anything but shut off the
> > - * display on a connector that was destroyed after its been notified,
> > - * not before.
> > - */
> > - if (!READ_ONCE(connector->registered) && crtc_state->active) {
> > - DRM_DEBUG_ATOMIC("[CONNECTOR:%d:%s] is not registered\n",
> > - connector->base.id, connector->name);
> > - return -EINVAL;
> > - }
> > -
> > funcs = connector->helper_private;
> >
> > if (funcs->atomic_best_encoder)
> > @@ -383,6 +363,7 @@ update_connector_routing(struct drm_atomic_state *state,
> >
> > set_best_encoder(state, new_connector_state, new_encoder);
> >
> > + crtc_state = drm_atomic_get_new_crtc_state(state, new_connector_state->crtc);
> > crtc_state->connectors_changed = true;
> >
> > DRM_DEBUG_ATOMIC("[CONNECTOR:%d:%s] using [ENCODER:%d:%s] on [CRTC:%d:%s]\n",
> > diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c
> > index d5b7f315098c..7acec863b10c 100644
> > --- a/drivers/gpu/drm/drm_atomic_uapi.c
> > +++ b/drivers/gpu/drm/drm_atomic_uapi.c
> > @@ -299,6 +299,27 @@ drm_atomic_set_crtc_for_connector(struct drm_connector_state *conn_state,
> > struct drm_connector *connector = conn_state->connector;
> > struct drm_crtc_state *crtc_state;
> >
> > + /*
> > + * For compatibility with legacy users, we want to make sure that
> > + * we allow DPMS On<->Off modesets on unregistered connectors, since
> > + * legacy modesetting users will not be expecting these to fail. We do
> > + * not however, want to allow legacy users to assign a connector
> > + * that's been unregistered from sysfs to another CRTC, since doing
> > + * this with a now non-existant connector could potentially leave us
> > + * in an invalid state.
> > + *
> > + * Since the connector can be unregistered at any point during an
> > + * atomic check or commit, this is racy. But that's OK: all we care
> > + * about is ensuring that userspace can't use this connector for new
> > + * configurations after it's been notified that the connector is no
> > + * longer present.
> > + */
> > + if (!READ_ONCE(connector->registered) && crtc) {
> > + DRM_DEBUG_ATOMIC("[CONNECTOR:%d:%s] is not registered\n",
> > + connector->base.id, connector->name);
> > + return -EINVAL;
>
> I was also pondering whether we should make this ENOENT to match the
> errno we use for the "this connector doesn't exist" case. But I guess
> it doesn't really matter.
>
> Reviewed-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>

This still allows modeset changes and all kinds of nasty things through,
which I think we don't want to. Anything that results in a full modeset
could go into your encoder->enable hooks and go boom because the mst sink
is gone.

I think what we need is a new connector->unplugged which we only set on
_unregister(). Or maybe changed the connector->registered to a tri-state.
That way we can reject modesets after stuff is gone, while not having to
reject modesets when it's not yet fully loaded (which some drivers need).
-Daniel

>
> > + }
> > +
> > if (conn_state->crtc == crtc)
> > return 0;
> >
> > --
> > 2.17.1
>
> --
> Ville Syrjälä
> Intel

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch