Re: Null pointer crash at find_idlest_group on db845c w/ linus/master

From: Valentin Schneider
Date: Wed Dec 04 2019 - 05:09:25 EST


On 04/12/2019 09:42, Qais Yousef wrote:
> On 12/04/19 09:06, Vincent Guittot wrote:
>> Hi John,
>>
>> On Tue, 3 Dec 2019 at 20:15, John Stultz <john.stultz@xxxxxxxxxx> wrote:
>>>
>>> With today's linus/master on db845c running android, I'm able to
>>> fairly easily reproduce the following crash. I've not had a chance to
>>> bisect it yet, but I'm suspecting its connected to Vincent's recent
>>> rework.
>>
>> Does the crash happen randomly or after a specific action ?
>> I have a db845 so I can try to reproduce it locally.
>
> Isn't there a chance we use local_sgs without initializing it in that function?
> AFAICS we define local_sgs on the stack but not always could be populated with
> the right values. I can't see tmp_sgs being used in the function too. Could
> this cause the/a problem?
>
> 8377	struct sg_lb_stats local_sgs, tmp_sgs;
> .
> .
> .
> 8399		if (local_group) {
> 8400			sgs = &local_sgs;
> 8401			local = group;
> 8402		} else {
> 8403			sgs = &tmp_sgs;
> 8404		}
> 8405
> 8406		update_sg_wakeup_stats(sd, group, sgs, p);
>

local_sgs is initialized by the first update_sg_wakeup_stats() call
(the local sched_group is always the first one in the sd->groups list),
and tmp_sgs holds the stats of whichever non-local group we're currently
iterating over, see:

	if (local_group) {
		sgs = &local_sgs;
		local = group;
	} else {
		sgs = &tmp_sgs;
	}

and that sgs is populated in

	update_sg_wakeup_stats(sd, group, sgs, p);
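
To make the pattern above concrete, here is a minimal userspace sketch of
the same loop shape. It is not the kernel code: toy_group, toy_stats,
toy_update_stats() and toy_find_idlest() are made-up stand-ins, the
is_local flag replaces the real cpumask test, and the final "!local"
guard is my own addition for the toy rather than something taken from
find_idlest_group(). The only point is that every pass through the loop
writes the stats through sgs before anything reads them, and that
local_sgs is only read once a local group has actually been seen.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sched_group; not the kernel definition. */
struct toy_group {
	struct toy_group *next;	/* circular list, like sched_group->next */
	bool is_local;		/* replaces the cpumask test against this_cpu */
	int load;
};

/* Hypothetical stand-in for struct sg_lb_stats. */
struct toy_stats {
	int load;
};

/* Stand-in for update_sg_wakeup_stats(): overwrites every field of *sgs. */
static void toy_update_stats(const struct toy_group *group,
			     struct toy_stats *sgs)
{
	sgs->load = group->load;
}

/*
 * Return the least loaded non-local group if it is less loaded than the
 * local one, or NULL if the local group is the better choice.
 */
static struct toy_group *toy_find_idlest(struct toy_group *first)
{
	struct toy_group *group = first, *local = NULL, *idlest = NULL;
	struct toy_stats local_sgs, tmp_sgs;
	struct toy_stats idlest_sgs = { .load = INT_MAX };
	struct toy_stats *sgs;

	assert(first != NULL);

	do {
		if (group->is_local) {
			sgs = &local_sgs;
			local = group;
		} else {
			sgs = &tmp_sgs;
		}

		/* Whichever struct sgs points at is fully written here. */
		toy_update_stats(group, sgs);

		if (!group->is_local && sgs->load < idlest_sgs.load) {
			idlest = group;
			idlest_sgs = *sgs;
		}
	} while (group = group->next, group != first);

	/* local_sgs is only meaningful if a local group was visited. */
	if (!local || !idlest)
		return NULL;

	return idlest_sgs.load < local_sgs.load ? idlest : NULL;
}
```

With a ring of three groups where the first is local, tmp_sgs is simply
overwritten on each non-local iteration, so the fact that it is never
read outside the loop is expected.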


> Cheers
>
> --
> Qais Yousef
>