Re: [RFC PATCH 1/1] drivers: base: Expose probe failures via debugfs

From: Greg Kroah-Hartman
Date: Thu Jun 03 2021 - 09:16:32 EST


On Thu, Jun 03, 2021 at 03:55:34PM +0300, Adrian Ratiu wrote:
> This adds a new devices_failed debugfs attribute to list driver
> probe failures excepting -EPROBE_DEFER which are still handled
> as before via their own devices_deferred attribute.

Who is going to use this?

> This is useful on automated test systems like KernelCI to avoid
> filtering dmesg dev_err() messages to extract potential probe
> failures.

I thought we listed these already some other way today?

> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> Cc: "Rafael J. Wysocki" <rafael@xxxxxxxxxx>
> Cc: Guillaume Tucker <gtucker.collabora@xxxxxxxxx>
> Suggested-by: Enric Balletbò <enric.balletbo@xxxxxxxxxxxxx>
> Signed-off-by: Adrian Ratiu <adrian.ratiu@xxxxxxxxxxxxx>
> ---
> drivers/base/core.c | 76 +++++++++++++++++++++++++++++++++++++++++++--
> lib/Kconfig.debug | 23 ++++++++++++++
> 2 files changed, 96 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index b8a8c96dca58..74bf057234b8 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -9,7 +9,9 @@
> */
>
> #include <linux/acpi.h>
> +#include <linux/circ_buf.h>
> #include <linux/cpufreq.h>
> +#include <linux/debugfs.h>
> #include <linux/device.h>
> #include <linux/err.h>
> #include <linux/fwnode.h>
> @@ -53,6 +55,15 @@ static DEFINE_MUTEX(fwnode_link_lock);
> static bool fw_devlink_is_permissive(void);
> static bool fw_devlink_drv_reg_done;
>
> +#ifdef CONFIG_DEBUG_FS_PROBE_ERR
> +#define PROBE_ERR_BUF_ELEM_SIZE 64
> +#define PROBE_ERR_BUF_SIZE (1 << CONFIG_DEBUG_FS_PROBE_ERR_BUF_SHIFT)
> +static struct circ_buf probe_err_crbuf;
> +static char failed_probe_buffer[PROBE_ERR_BUF_SIZE];
> +static DEFINE_MUTEX(failed_probe_mutex);
> +static struct dentry *devices_failed_probe;
> +#endif

All of this just for a log buffer? The kernel already provides a great
one, printk. Let's not create yet-another-log-buffer if at all possible,
please.

If the existing messages are "hard to parse", what can we do to make
them "easier" that would allow systems to do something with them?

What _do_ systems want to do with this information anyway? What does it
help with exactly?



> +
> /**
> * fwnode_link_add - Create a link between two fwnode_handles.
> * @con: Consumer end of the link.
> @@ -3769,6 +3780,29 @@ struct device *device_find_child_by_name(struct device *parent,
> }
> EXPORT_SYMBOL_GPL(device_find_child_by_name);
>
> +/*
> + * failed_devs_show() - Show devices & drivers which failed to probe.
> + */
> +#ifdef CONFIG_DEBUG_FS_PROBE_ERR

.c files shouldn't have #ifdefs if at all possible, so this patch isn't
good for that reason alone :(
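
For illustration only, here is a minimal userspace sketch of the usual
alternative: keep the conditional in a header that provides empty inline
stubs, so the caller needs no #ifdef at all. Every name in it
(CONFIG_PROBE_ERR_LOG, probe_err_record(), probe_err_count(),
really_probe_sketch()) is hypothetical and not taken from the patch, and
the config macro merely stands in for a Kconfig symbol.

```c
/* Userspace sketch of the "stubs in a header" pattern; all names are hypothetical. */
#include <stdio.h>

/* Stand-in for a Kconfig symbol; a kernel build would take this from .config. */
#ifndef CONFIG_PROBE_ERR_LOG
#define CONFIG_PROBE_ERR_LOG 1
#endif

/* ---- part that would live in a header ---- */
#if CONFIG_PROBE_ERR_LOG
int probe_err_record(const char *dev, int err);
int probe_err_count(void);
#else
/* Feature disabled: empty stubs that the compiler can optimize away. */
static inline int probe_err_record(const char *dev, int err)
{
	(void)dev;
	(void)err;
	return 0;
}
static inline int probe_err_count(void)
{
	return 0;
}
#endif

/*
 * ---- part that would live in its own .c file ----
 * In a real tree the Makefile would build that file only when the option
 * is set; the #if here exists only because this sketch is a single file.
 */
#if CONFIG_PROBE_ERR_LOG
static int nr_failures;

int probe_err_record(const char *dev, int err)
{
	printf("probe of %s failed with error %d\n", dev, err);
	return ++nr_failures;
}

int probe_err_count(void)
{
	return nr_failures;
}
#endif

/* ---- caller: no #ifdef needed, whatever the configuration ---- */
int really_probe_sketch(const char *dev, int ret)
{
	if (ret)
		probe_err_record(dev, ret);
	return ret;
}
```

Calling really_probe_sketch("dev1", -22) records one failure when the
option is on and does nothing when it is off; the call site is identical
in both cases, which is the point of the pattern.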


> +static int failed_devs_show(struct seq_file *s, void *data)
> +{
> + size_t offset;
> +
> + mutex_lock(&failed_probe_mutex);
> +
> + for (offset = 0;
> + offset < PROBE_ERR_BUF_SIZE;
> + offset += PROBE_ERR_BUF_ELEM_SIZE)
> + if (probe_err_crbuf.buf[offset])
> + seq_printf(s, "%s\n", probe_err_crbuf.buf + offset);
> +
> + mutex_unlock(&failed_probe_mutex);
> +
> + return 0;
> +}
> +DEFINE_SHOW_ATTRIBUTE(failed_devs);
> +#endif
> +
> int __init devices_init(void)
> {
> devices_kset = kset_create_and_add("devices", &device_uevent_ops, NULL);
> @@ -3784,6 +3818,12 @@ int __init devices_init(void)
> if (!sysfs_dev_char_kobj)
> goto char_kobj_err;
>
> +#ifdef CONFIG_DEBUG_FS_PROBE_ERR
> + devices_failed_probe = debugfs_create_file("devices_failed", 0444, NULL,
> + NULL, &failed_devs_fops);
> + probe_err_crbuf.buf = failed_probe_buffer;

Nit, no need to save the dentry here, you can look it up later on if you
really need it. But most importantly, you NEVER do anything with this
dentry, so why save it at all?

And again, #ifdef is not ok, that makes the code much harder to
maintain over time.

thanks,

greg k-h