Re: [PATCH] firmware: fix async/manual firmware loading

From: Luis R. Rodriguez
Date: Thu Nov 10 2016 - 14:49:05 EST


On Thu, Nov 10, 2016 at 10:52 AM, Bjorn Andersson
<bjorn.andersson@xxxxxxxxxx> wrote:
> On Thu 10 Nov 08:07 PST 2016, Luis R. Rodriguez wrote:
>
>> On Thu, Nov 10, 2016 at 7:55 AM, Greg Kroah-Hartman
>> <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>> > On Wed, Nov 09, 2016 at 09:39:21PM +0100, Luis R. Rodriguez wrote:
>> >> On Sun, Oct 30, 2016 at 03:50:48PM +0100, Yves-Alexis Perez wrote:
>> >> > From: Yves-Alexis Perez <corsac@xxxxxxxxxx>
>> >> >
>> >> > wait_for_completion_interruptible_timeout() return value is either
>> >> > -ERESTARTSYS (in case it was interrupted), 0 (in case the timeout expired)
>> >> > or the number of jiffies left until timeout. The return value is stored in
>> >> > a long, but in _request_firmware_load() it's silently casted to an int,
>> >> > which can overflow and give a negative value, indicating an error.
>> >> >
>> >> > Fix this by re-using the timeout variable and only set retval when it's
>> >> > safe.
>> >>
>> >> Please amend the commit log as I noted in the previous response, and
>> >> resend.
>> >>
>> >> > Signed-off-by: Yves-Alexis Perez <corsac@xxxxxxxxxx>
>> >> > Cc: Ming Lei <ming.lei@xxxxxxxxxxxxx>
>> >> > Cc: "Luis R. Rodriguez" <mcgrof@xxxxxxxxxx>
>> >> > Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
>> >>
>> >> Other than the commit log you can add on you resend:
>> >>
>> >> Acked-by: Luis R. Rodriguez.
>> >>
>> >> Modulo I don't personally thing this this is sable material but I'll let
>> >> Greg decide.
>> >
>> > Does it fix a regression?
>>
>
> Yes
>
>> Not that I am aware of, but if you consider the reported the developer
>> then yes.
>>
>
> I haven't verified that this particular use case actually worked before,
> but this code works with lower timeout values (e.g. 60 in the fallback
> case), so this looks isolated.

This is true, but as I noted the broken aspect was when the timeout
was set to the max value.

> The bug was clearly introduced in v4.0 by:
>
> 68ff2a00dbf5 "firmware_loader: handle timeout via wait_for_completion_interruptible_timeout()"
>
> So please add a Fixes: and
>
> Reviewed-by: Bjorn Andersson <bjorn.andersson@xxxxxxxxxx>

This I agree with, thanks for that, and because of this then:

Acked-by: Luis R. Rodriguez <mcgrof@xxxxxxxxxx>

And because of this do recommend it for stable. I would still prefer
at least a new re-submit with the respected tags and a changed commit
log describing the reason for the fix, how the cast is an issue
exactly, and how this is a regression.

>> > A reported issue with an older kernel version
>> > that people have hit?
>>
>> Definitely not.
>>
>> > It shouldn't be hard to figure out if a patch should be in stable or not...
>>
>> Well with the only caveat now that I am suggesting we consider remove
>> this logic completely as only 2 drivers were using it explicitly
>> (second argument to request_firmware_nowait() set to false), it seems
>> they had good reasons for it but ... this has been broken for ages and
>> we seem to be happy to compartamentalize the UMH further, its unclear
>> why we would want to expand and "fix" that instead of just removing
>> crap that never worked. Thoughts?
>>
>
> Please Luis, just stop your crusade on this code. You're grasping at
> every straw of opportunity to get this code out of the kernel,

No, I'm pointing out valid issues the code has had historically and
things folks had not realized. I already knew we could not get rid of
it, but if this was *not* a regression and if this was broken always
then clearly it was something worth considering to just remove. But as
you note, its a regression. Thanks for identifying that.

> but it
> has not been broken for ages, it works just fine and it is ABI.

Agreed.

> I'm very concerned about your mission to to "compartamentalize" this
> code when you're so certain that it's "broken crap".

Well the firmware UMH fallback code is craptastic code, use at your own risk.

Luis