Re: [RFC PATCH v3] Use kernfs_break_active_protection() for device online store callbacks

From: Li Zhong
Date: Wed Apr 16 2014 - 23:06:40 EST


On Wed, 2014-04-16 at 11:17 -0400, Tejun Heo wrote:
> Hello,
>
> On Wed, Apr 16, 2014 at 09:41:40AM +0800, Li Zhong wrote:
> > > If so, that is
> > > an actually possible deadlock, no?
> >
> > Yes, but it seems to me that it is solved in commit 5e33bc41, which uses
> > lock_device_hotplug_sysfs() to return a restart syscall error if not
> > able to try lock the device_hotplug_lock. That also requires the device
> > removing code path to take the device_hotplug_lock.
>
> But that patch only takes device_hotplug_lock out of the
> dependency graph and does nothing for cpu_add_remove_lock. It seems
> to me that there still is a deadlock condition involving s_active and
> cpu_add_remove_lock. Am I missing something here?

It seems to me that cpu_add_remove_lock is always taken after
device_hotplug_lock.

So if cpu_add_remove_lock has been acquired by the device removal path,
the concurrent online/offline path cannot have succeeded in its trylock
of device_hotplug_lock, and it will release s_active by returning a
restart syscall error.

If cpu_add_remove_lock has been acquired by the online/offline path,
then that path must already hold device_hotplug_lock, which keeps the
device removal path waiting at device_hotplug_lock. The online/offline
path can then finish, release its locks, and release s_active soon
afterwards.
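
To make the ordering argument above concrete, here is a minimal
userspace sketch. It is not kernel code: the locks are modelled with
C11 atomic flags, the sketch_* helpers, online_store_sketch() and the
SKETCH_* return codes are hypothetical names, and s_active is only
represented by the assumption that the caller holds it.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-ins named after the kernel locks discussed in this thread. */
static atomic_flag device_hotplug_lock = ATOMIC_FLAG_INIT;
static atomic_flag cpu_add_remove_lock = ATOMIC_FLAG_INIT;

/* Hypothetical return codes; SKETCH_RESTART stands in for the restart syscall error. */
enum { SKETCH_OK = 0, SKETCH_RESTART = -1 };

/* Returns true if the lock was free and is now held by the caller. */
static bool sketch_trylock(atomic_flag *l) { return !atomic_flag_test_and_set(l); }
/* Spins until the lock is acquired. */
static void sketch_lock(atomic_flag *l)    { while (atomic_flag_test_and_set(l)) { } }
static void sketch_unlock(atomic_flag *l)  { atomic_flag_clear(l); }

/* Online/offline store path; the caller is assumed to hold s_active. */
static int online_store_sketch(void)
{
    /* Never block on device_hotplug_lock while s_active is held. */
    if (!sketch_trylock(&device_hotplug_lock))
        return SKETCH_RESTART;          /* caller drops s_active and retries */

    sketch_lock(&cpu_add_remove_lock);  /* always nested inside device_hotplug_lock */
    /* ... online/offline work would go here ... */
    sketch_unlock(&cpu_add_remove_lock);
    sketch_unlock(&device_hotplug_lock);
    return SKETCH_OK;
}
```

If a remover already holds device_hotplug_lock, the store path backs
off instead of waiting; otherwise it runs to completion and the remover
waits at device_hotplug_lock, never at cpu_add_remove_lock.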

But after some further thinking, I think I understand your point:
s_active has a lock ordering problem with the whole series of
hotplug-related locks, so it is better to take s_active out of the
dependency chain, rather than only the first lock of that series, as
you suggested below?
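
As a rough illustration of what taking s_active out of the chain means,
here is a small userspace model. It is only a sketch under my own
assumptions: struct node_sketch and all *_sketch functions are
hypothetical, and the active reference is reduced to a plain counter
rather than what kernfs really does.

```c
#include <stdbool.h>

/* Hypothetical model of a node's active reference count (s_active). */
struct node_sketch {
    int active; /* active references held by running callbacks */
};

/* Removal can only complete once no active references remain. */
static bool can_remove_sketch(const struct node_sketch *n)
{
    return n->active == 0;
}

/* Models the core taking/dropping an active reference around a store callback. */
static void enter_callback_sketch(struct node_sketch *n) { n->active++; }
static void leave_callback_sketch(struct node_sketch *n) { n->active--; }

/* Models the callback giving up its active reference before taking hotplug locks. */
static void break_active_sketch(struct node_sketch *n)   { n->active--; }

/* Models the callback restoring the reference before it returns to the core. */
static void unbreak_active_sketch(struct node_sketch *n) { n->active++; }
```

In this model, a remover is no longer held up by the callback between
break_active_sketch() and unbreak_active_sketch(), so the callback can
block on the hotplug locks without s_active being part of the cycle.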

>
> Now that kernfs has a proper mechanism to deal with it, wouldn't it
> make more sense to replace 5e33bc41 with proper s_active protection
> breaking?

I'll try this way and send you the code for review.

Thanks,
Zhong

>
> Thanks.
>

