Re: [PATCH] driver core: fix shutdown races with probe/remove(v2)

From: Ming Lei
Date: Wed Jun 20 2012 - 21:21:07 EST


On Thu, Jun 21, 2012 at 6:37 AM, Greg Kroah-Hartman
<gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, Jun 19, 2012 at 10:00:36AM +0800, Ming Lei wrote:

>> so I marked it as -stable because I have explained how the race can be
>> exploited in reality.
>
> Ok, but as this has been there since when, 2.5, I'll refrain from
> marking it this way, as no one has reported a real problem like this
> before.

So are you OK with keeping Cc: stable in v3?

>> I still have many more examples in the kernel of timeout values...
>
> Yes, I know this, but now you are putting a limit on the amount of time

No, I am not putting a limit on it; see my explanation below.

> a probe function can take, when before, we have never had one.  That's
> not something to be taken lightly, and is one I know is not true.
>
>> > Why not just do a real lock and try for forever?
>>
>> IMO, there are two advantages to not just taking a real lock and
>> waiting forever:
>>
>> - it avoids letting a buggy device/driver hang the system
>> - with trylock, we can log the buggy device, which makes it a bit
>> easier to troubleshoot buggy drivers, say when the bug is
>> triggered only once a year or less often
>
> No, just fix the driver, I don't want to put a time limit on how long

Sure, we need to fix the driver, but it may be very difficult to do so
without the log message introduced by the patch. Since the patch has no
obvious side effects, why not take it?

> probe can take, as we never have in the past and I'm sure that whatever
> we pick, will be wrong for someone.
>
> I have seen devices that take many seconds, and minutes for some if bad
> things happen (i.e. the firmware doesn't download properly).  Don't
> break people's working systems.

Even if some unusual device takes that long in its probe callback, the
patch does not break anything. Once the timeout expires, the behavior is
the same as before the patch: the .shutdown callback is called directly,
without holding the device lock.
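
To make the fallback concrete, here is a rough user-space sketch of the
idea, not the patch itself. It uses a pthread mutex in place of the
device lock, and every name in it (struct fake_device,
lock_with_timeout(), shutdown_one(), the 1 ms polling step) is made up
for illustration. It only shows the control flow: try the lock for a
bounded time, log the device if the lock stays busy, then call the
shutdown callback either way.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for struct device; not the kernel structure. */
struct fake_device {
	const char *name;
	pthread_mutex_t lock;	/* plays the role of the device lock */
	void (*shutdown)(struct fake_device *dev);
	int shutdown_calls;	/* bookkeeping for the example only */
};

/* Poll the lock roughly once per millisecond until timeout_ms elapses. */
static bool lock_with_timeout(pthread_mutex_t *lock, long timeout_ms)
{
	const struct timespec pause = { .tv_sec = 0, .tv_nsec = 1000000 };
	long waited_ms = 0;

	for (;;) {
		if (pthread_mutex_trylock(lock) == 0)
			return true;
		if (waited_ms >= timeout_ms)
			return false;
		nanosleep(&pause, NULL);
		waited_ms++;
	}
}

/*
 * Shut down one device. Returns true if the callback ran with the lock
 * held, false if the lock stayed busy and the callback ran unlocked
 * (the same situation as shutting down without taking the lock at all).
 */
static bool shutdown_one(struct fake_device *dev, long timeout_ms)
{
	bool locked;

	assert(dev != NULL);
	locked = lock_with_timeout(&dev->lock, timeout_ms);
	if (!locked)
		fprintf(stderr, "%s: lock still busy after %ld ms, "
			"calling shutdown without it\n",
			dev->name, timeout_ms);

	if (dev->shutdown)
		dev->shutdown(dev);

	if (locked)
		pthread_mutex_unlock(&dev->lock);
	return locked;
}

/* Example callback: just counts how often it was invoked. */
static void count_shutdown(struct fake_device *dev)
{
	dev->shutdown_calls++;
}
```

In this sketch a slow probe only delays shutdown by at most the timeout
and leaves a log line naming the device; the callback is never skipped.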

So could we just change the timeout to a larger value, such as 10
seconds or more?

Thanks,
--
Ming Lei