Re: [PATCH] driver core: Disable driver deferred probe timeout by default

From: Greg Kroah-Hartman
Date: Mon Nov 14 2022 - 05:54:21 EST


On Mon, Nov 14, 2022 at 11:43:33AM +0100, Javier Martinez Canillas wrote:
> The driver_deferred_probe_timeout value has a long history. It was first set
> to -1 when it was introduced by commit 25b4e70dcce9 ("driver core: allow
> stopping deferred probe after init"), meaning that the driver core would
> defer the probe forever unless a subsystem opted in by checking whether the
> initcalls were done using the driver_deferred_probe_check_state() helper,
> or a timeout was explicitly set with the "deferred_probe_timeout" param.
>
> Only the power domain, IOMMU and MDIO subsystems currently opt in to check
> if the initcalls have completed with driver_deferred_probe_check_state().
>
> Commit c8c43cee29f6 ("driver core: Fix driver_deferred_probe_check_state()
> logic") then changed the driver_deferred_probe_check_state() helper logic,
> to take into account whether modules have been enabled or not and also to
> return -EPROBE_DEFER if the probe deferred timeout was still running.
>
> Then in commit e2cec7d68537 ("driver core: Set deferred_probe_timeout to a
> longer default if CONFIG_MODULES is set"), the timeout was increased to 30
> seconds if modules are enabled, because it seems that some of the
> subsystems that opted in to not returning -EPROBE_DEFER after the initcalls
> were done could still have dependencies whose drivers were built as modules.
>
> This commit made a fundamental change to how probe deferral worked though,
> since now the default was not to attempt probing for drivers indefinitely
> but instead to time out after 30 seconds unless a different timeout
> was set using the "deferred_probe_timeout" parameter.
>
> The behavior was changed even more with commit ce68929f07de ("driver core:
> Revert default driver_deferred_probe_timeout value to 0"), since the value
> was set to 0 by default, meaning that probe deferral would be disabled
> after the initcalls were done unless a timeout was set on the cmdline.
>
> Notice that the commit said that it was reverting the default value to 0,
> but this was never 0. The default was -1 at the beginning and then changed
> to 30 in a later commit.
>
> This default value of 0 was reverted again by commit f516d01b9df2 ("Revert
> "driver core: Set default deferred_probe_timeout back to 0."") and set to
> 10 seconds instead, which was still less than the 30 seconds that was set
> at some point to allow systems with drivers built as modules and loaded by
> user-land to probe drivers that were previously deferred.
>
> The 10 seconds timeout isn't enough for the mentioned systems. For example,
> general purpose distributions attempt to build all the possible drivers as
> modules to keep the Linux kernel image small. But that means that it is very
> likely that the probe deferral mechanism will time out and drivers won't be
> probed correctly.

What specific "mentioned systems" have deferred probe drivers that are
failing on the current value? What drivers are causing the long delay
here? No one should be having to wait 10 seconds for a deferred delay
on a real system as that feels really wrong.

Why not fix the drivers that are causing this delay and maybe move them
to be async so as to not slow down the whole boot process?

thanks,

greg k-h