Re: [PATCH] BUG-REPORT: snd-hda: hacked-together EPROBE_DEFER support

From: Daniel Vetter
Date: Mon Jun 26 2017 - 12:16:47 EST


On Wed, Jun 21, 2017 at 05:30:10PM +0200, Takashi Iwai wrote:
> On Wed, 21 Jun 2017 17:23:57 +0200,
> Chris Wilson wrote:
> >
> > Quoting Daniel Vetter (2017-06-21 16:08:54)
> > > So back when the i915 power well support landed in
> > >
> > > commit 99a2008d0b32d72dfc2a54e7be1eb698dd2e3bd6
> > > Author: Wang Xingchao <xingchao.wang@xxxxxxxxxxxxxxx>
> > > Date: Thu May 30 22:07:10 2013 +0800
> > >
> > > ALSA: hda - Add power-welll support for haswell HDA
> > >
> > > the logic to handle the cross-module dependencies was hand-rolled using
> > > an async work item, and that just doesn't work.
> > >
> > > The correct way to handle cross-module deps is either:
> > > - request_module + failing when the other module isn't there
> > >
> > > OR
> > >
> > > - failing the module load with EPROBE_DEFER.
> > >
> > > You can't mix them, if you do then the entire load path just
> > > busy-spins blowing through cpu cycles forever with no way to stop
> > > this.
> > >
> > > snd-hda-intel does mix it, because the hda codec drivers are loaded
> > > using request_module, but the i915 dependency is handled using
> > > EPROBE_DEFER (or well, should be, but I haven't found any code at all).
> > > This is a major pain when trying to debug i915 load failures.
> > >
> > > This patch here is a horrible hackish attempt at somewhat correctly
> > > wiring EPROBE_DEFER through. Stuff that's missing:
> > > - Check all the other places where load errors are conveniently
> > > dropped on the floor.
> > > - Also fix up the firmware_cb path.
> > > - Drop the debug noise I've left in to make it clear this isn't
> > > anything for merging.
> >
> > This tames "hdaudio hdaudioC0D0: Unable to bind the codec" which was
> > continuously spewing previously, and now the system is usable again.
>
> Could you give a failing scenario? I'm not opposed to the suggested
> solution, we need to fix the mess anyway, but I just would like to
> know how to trigger the problem easily.

Disable i915 loading, e.g. with i915.modeset=0. Watch how the snd-hda*
modules collectively blow through 100% of the CPU time spewing into dmesg
(and make the system completely unusable for kernel work because you can't
find your own debug printk anymore).

This is on a snb, where we don't even need the cross-module stuff ... But
I think it goes sideways in other cases too, if you simply build but don't
load i915. So every time i915 breaks module load, things become real
painful.
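
For anyone following along, the EPROBE_DEFER contract mentioned in the
quoted patch description can be modelled in userspace roughly as below.
This is only a toy sketch, not snd-hda or driver-core code: toy_dev,
toy_hda_probe, toy_retry_deferred and the i915_ready flag are all
hypothetical names made up for illustration. The only real detail is the
numeric value of EPROBE_DEFER, which matches include/linux/errno.h. The
point is that a deferred probe gets retried once per trigger (another
driver binding), instead of being re-run from a work item in a loop.

```c
#include <stdbool.h>
#include <stddef.h>

#define EPROBE_DEFER 517	/* same value as include/linux/errno.h */

/* Hypothetical stand-in for a device whose driver needs another module. */
struct toy_dev {
	const char *name;
	bool bound;
	int probe_calls;
};

/* Hypothetical flag: has the dependency (i915 in this thread) bound yet? */
static bool i915_ready;

/* Probe fails with -EPROBE_DEFER for as long as the dependency is missing. */
static int toy_hda_probe(struct toy_dev *dev)
{
	dev->probe_calls++;
	if (!i915_ready)
		return -EPROBE_DEFER;
	dev->bound = true;
	return 0;
}

/*
 * Toy model of a deferred-probe retry pass: call it once whenever some
 * other driver has bound. Each still-unbound device is probed exactly
 * once per pass; there is no timer and no loop that retries on its own.
 * Returns the number of devices that bound during this pass.
 */
static int toy_retry_deferred(struct toy_dev *devs, size_t n)
{
	int newly_bound = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (devs[i].bound)
			continue;
		if (toy_hda_probe(&devs[i]) == 0)
			newly_bound++;
	}
	return newly_bound;
}
```

With i915_ready left false, every call to toy_retry_deferred() costs one
probe attempt per device and then stops, so the amount of retrying is
bounded by how often something else binds.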

Unfortunately the patch is a bit too big for our fixup branch in drm-tip,
so plan B would be to stop building snd-hda (which will make the intel
audio team unhappy, but mea culpa if they don't fix this mess).
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch