Re: [PATCH] driver core: fix possible missing of device probe

From: Russell King - ARM Linux
Date: Fri Sep 28 2012 - 04:48:35 EST


On Fri, Sep 28, 2012 at 09:01:13AM +0530, anish singh wrote:
> Hello Ming,
> Though I am not an expert in this driver core area but
> I have been following this fix.So have some queries below:
>
> On Fri, Sep 28, 2012 at 6:22 AM, Ming Lei <ming.lei@xxxxxxxxxxxxx> wrote:
> > Inside bus_add_driver(), one device might be added into
> should it not be "driver might be added into"?
> > the bus or probed which is triggered by deferred probe
> > just after completing of driver_attach() and before
> > 'klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers)',
> > so the device won't be probed by this driver.
> So the corresponding device will not be probed.
> >
> > This patch moves the below line
> >
> > 'klist_add_tail(&priv->knode_bus, &bus->p->klist_drivers)'
> >
> > before driver_attach() inside bus_add_driver().
> >
> > So fixes the problem since the below way can guarantee that
> > no probe(dev) may be lost.
> >
> As I understand CPU0 is just calling bus_add_driver and driver_attach
> is called and after that it was pre-empted and cpu1 came into picture.
> Deferred probe started running on cpu1 and it didn't find the driver present
> int the knode_bus and unloaded the driver(why it unloaded is already
> explained by russell in his first post).
> Hope my understanding is correct.
> > CPU0 CPU1
> > driver_register
> > ...
> > write(bus->driver_list)
> > smp_mb()
> > read(bus->device_list)
> > ...
> > device_add
> > /* bus_add_device */
> > write(bus->device_list)
> > smp_mb()
> > /* bus_probe_device*/
> > read(bus->driver_list)

Actually, this description is rubbish. It's not about the SMP barriers,
or read/write device/driver lists, or about two CPUs (my test setup only
has one CPU.)

It's about threads, and the relative timing of those threads through
the driver model code. All in all, what with the error path issue, and
now the blatently wrong description, I'm not gaining much confidence in
Ming Lei.

The below is actually what's happening, according to my analysis - and
you'll notice that the device list has absolutely nothing to do with it:

Thread 0 Thread 1 Thread 2
driver_attach()
bus_for_each_dev()
__driver_attach(, devA)
driver_probe_device(, devA)
really_probe(devA, )
driver_deferred_probe_add(devA)
driver_attach()
bus_for_each_dev()
__driver_attach(, devA)
driver_probe_device(, devB)
really_probe(devB, )
driver_bound(devB)
driver_deferred_probe_trigger()
deferred_probe_work_func()
bus_probe_device(devA)
device_attach(devA)
bus_for_each_drv()
__device_attach(, devA)
*fails to find driver*
*devA dropped from
deferred probe list*
klist_add_tail(klist_drivers)

The reason this fix works is because the order in thread 0 means that
when thread 2 comes to re-probe the device, it does find the driver,
and if the resources that the driver needs are still not found (and it
again returns -EPROBE_DEFER) the device will be placed back on the
deferred probe list.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/