Re: [PATCH v3 1/2] sched/fair: Add NOHZ balancer flag for nohz.next_balance updates

From: Valentin Schneider
Date: Mon Aug 23 2021 - 08:57:56 EST


On 23/08/21 13:59, Peter Zijlstra wrote:
> On Mon, Aug 23, 2021 at 12:16:59PM +0100, Valentin Schneider wrote:
>
>> Gate NOHZ blocked load
>> update by the presence of NOHZ_STATS_KICK - currently all NOHZ balance
>> kicks will have the NOHZ_STATS_KICK flag set, so no change in behaviour is
>> expected.
>
>> @@ -10572,7 +10572,8 @@ static void _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
>> * setting the flag, we are sure to not clear the state and not
>> * check the load of an idle cpu.
>> */
>> - WRITE_ONCE(nohz.has_blocked, 0);
>> + if (flags & NOHZ_STATS_KICK)
>> + WRITE_ONCE(nohz.has_blocked, 0);
>>
>> /*
>> * Ensures that if we miss the CPU, we must see the has_blocked
>> @@ -10594,13 +10595,15 @@ static void _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
>> * balancing owner will pick it up.
>> */
>> if (need_resched()) {
>> - has_blocked_load = true;
>> + if (flags & NOHZ_STATS_KICK)
>> + has_blocked_load = true;
>> goto abort;
>> }
>>
>> rq = cpu_rq(balance_cpu);
>>
>> - has_blocked_load |= update_nohz_stats(rq);
>> + if (flags & NOHZ_STATS_KICK)
>> + has_blocked_load |= update_nohz_stats(rq);
>>
>> /*
>> * If time for next balance is due,
>> @@ -10631,8 +10634,9 @@ static void _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
>> if (likely(update_next_balance))
>> nohz.next_balance = next_balance;
>>
>> - WRITE_ONCE(nohz.next_blocked,
>> - now + msecs_to_jiffies(LOAD_AVG_PERIOD));
>> + if (flags & NOHZ_STATS_KICK)
>> + WRITE_ONCE(nohz.next_blocked,
>> + now + msecs_to_jiffies(LOAD_AVG_PERIOD));
>>
>> abort:
>> /* There is still blocked load, enable periodic update */
>
> I'm a bit puzzled by this; that function has:
>
> SCHED_WARN_ON((flags & NOHZ_KICK_MASK) == NOHZ_BALANCE_KICK);
>
> Which:
>
> - isn't updated
> - implies STATS must be set when BALANCE

Yup

>
> the latter gives rise to my confusion; why add that gate on STATS? It
> just doesn't make sense to do a BALANCE and not update STATS.

AFAIA that warning was only there to catch BALANCE && !STATS, so I didn't
tweak it.

Now, you could still end up with

flags == NOHZ_NEXT_KICK

(e.g. nohz.next_balance is in the future, but a new CPU entered NOHZ-idle
and needs its own rq.next_balance collated into the nohz struct)

in which case you don't do any blocked load update, hence the
gate. In v1 I had that piggyback on NOHZ_STATS_KICK, but Vincent noted
that might not be the best given blocked load updates can be time
consuming - hence the separate flag.