Re: [PATCH v2] tick/broadcast: Do not set oneshot_mask except was_periodic was true

From: Thomas Gleixner
Date: Fri Apr 21 2023 - 17:32:27 EST


On Wed, Apr 19 2023 at 15:36, Frederic Weisbecker wrote:
> Le Sat, Apr 15, 2023 at 11:01:51PM +0200, Thomas Gleixner a écrit :
>> @@ -1020,48 +1021,89 @@ static inline ktime_t tick_get_next_peri
>> /**
>> * tick_broadcast_setup_oneshot - setup the broadcast device
>> */
>> -static void tick_broadcast_setup_oneshot(struct clock_event_device *bc)
>> +static void tick_broadcast_setup_oneshot(struct clock_event_device *bc,
>> + bool from_periodic)
>> {
>> int cpu = smp_processor_id();
>> + ktime_t nexttick = 0;
>>
>> if (!bc)
>> return;
>>
>> /* Set it up only once ! */
>> - if (bc->event_handler != tick_handle_oneshot_broadcast) {
>> - int was_periodic = clockevent_state_periodic(bc);
>> -
>> - bc->event_handler = tick_handle_oneshot_broadcast;
>> -
>> + if (bc->event_handler == tick_handle_oneshot_broadcast) {
>> /*
>> - * We must be careful here. There might be other CPUs
>> - * waiting for periodic broadcast. We need to set the
>> - * oneshot_mask bits for those and program the
>> - * broadcast device to fire.
>> + * The CPU which switches from periodic to oneshot mode
>> + * sets the broadcast oneshot bit for all other CPUs which
>> + * are in the general (periodic) broadcast mask to ensure
>> + * that CPUs which wait for the periodic broadcast are
>> + * woken up.
>> + *
>> + * Clear the bit for the local CPU as the set bit would
>> + * prevent the first tick_broadcast_enter() after this CPU
>> + * switched to oneshot state to program the broadcast
>> + * device.
>> */
>> + tick_broadcast_clear_oneshot(cpu);
>
> So this path is reached when we setup/exchange a new tick device
> on a CPU after the broadcast device has been set to oneshot, right?
>
> Why does it have a specific treatment? Is it for optimization? Or am I
> missing a correctness based reason?

This path is taken during the switch from periodic to oneshot mode. The
way how this works is:

boot()
setup_periodic()
setup_periodic_broadcast()

// From here on everything depends on the periodic broadcasting

highres_clocksource_becomes_available()
tick_clock_notify() <- Set's the .check_clocks bit on all CPUs

Now the first CPU which observes that bit switches to oneshot mode, but
the other CPUs might be waiting for the periodic broadcast at that
point. So the periodic to oneshot transition does:

cpumask_copy(tmpmask, tick_broadcast_mask);
/* Remove the local CPU as it is obviously not idle */
cpumask_clear_cpu(cpu, tmpmask);
cpumask_or(tick_broadcast_oneshot_mask, tick_broadcast_oneshot_mask, tmpmask);

I.e. it makes sure that _ALL_ not yet converted CPUs will get woken up
by the new oneshot broadcast handler.

Now when the other CPUs will observe the check_clock bit after that they
need to clear their bit in the oneshot mask while switching themself
from periodic to oneshot one otherwise the next tick_broadcast_enter()
would do nothing. That's all serialized by broadcast lock, so no race.

But that has nothing to do with switching the underlying clockevent
device. At that point all CPUs are already in oneshot mode and
tick_broadcast_oneshot_mask is correct.

So that will take the other code path:

if (bc->event_handler == tick_handle_oneshot_broadcast) {
// not taken because the new device is not yet set up
return;
}

if (from_periodic) {
// not taken because the switchover already happened
// Here is where the cpumask magic happens
}

> For the case where the other CPUs have already installed their
> tick devices and if that function is called with from_periodic=true,
> the other CPUs will notice the oneshot change on their next call to
> tick_broadcast_enter() thanks to the lock, right? So the tick broadcast
> will keep firing until all CPUs have been through idle once and called
> tick_broadcast_exit(), right? Because only them can clear themselves
> from tick_broadcast_oneshot_mask, am I understanding this correctly?

No. See above. It's about the check_clock bit handling on the other
CPUs.

It seems I failed miserably to explain that coherently with the tons of
comments added. Hrmpf :(

> I'm trying to find the opportunity for a race with dev->next_event
> being seen as too far ahead in the future but can't manage so far...

There is none :)

Thanks,

tglx