Re: [PATCH v2] tick/broadcast: Do not set oneshot_mask except was_periodic was true

From: Frederic Weisbecker
Date: Wed Apr 19 2023 - 09:37:09 EST


On Sat, Apr 15, 2023 at 11:01:51PM +0200, Thomas Gleixner wrote:
> @@ -1020,48 +1021,89 @@ static inline ktime_t tick_get_next_peri
> /**
> * tick_broadcast_setup_oneshot - setup the broadcast device
> */
> -static void tick_broadcast_setup_oneshot(struct clock_event_device *bc)
> +static void tick_broadcast_setup_oneshot(struct clock_event_device *bc,
> + bool from_periodic)
> {
> int cpu = smp_processor_id();
> + ktime_t nexttick = 0;
>
> if (!bc)
> return;
>
> /* Set it up only once ! */
> - if (bc->event_handler != tick_handle_oneshot_broadcast) {
> - int was_periodic = clockevent_state_periodic(bc);
> -
> - bc->event_handler = tick_handle_oneshot_broadcast;
> -
> + if (bc->event_handler == tick_handle_oneshot_broadcast) {
> /*
> - * We must be careful here. There might be other CPUs
> - * waiting for periodic broadcast. We need to set the
> - * oneshot_mask bits for those and program the
> - * broadcast device to fire.
> + * The CPU which switches from periodic to oneshot mode
> + * sets the broadcast oneshot bit for all other CPUs which
> + * are in the general (periodic) broadcast mask to ensure
> + * that CPUs which wait for the periodic broadcast are
> + * woken up.
> + *
> + * Clear the bit for the local CPU as the set bit would
> + * prevent the first tick_broadcast_enter() after this CPU
> + * switched to oneshot state to program the broadcast
> + * device.
> */
> + tick_broadcast_clear_oneshot(cpu);

So this path is reached when we set up/exchange a new tick device
on a CPU after the broadcast device has been set to oneshot, right?

Why does it get special treatment? Is it an optimization? Or am I
missing a correctness-based reason?

> + }
> +
> +
> + bc->event_handler = tick_handle_oneshot_broadcast;
> + bc->next_event = KTIME_MAX;
> +
> + /*
> + * When the tick mode is switched from periodic to oneshot it must
> + * be ensured that CPUs which are waiting for periodic broadcast
> + * get their wake-up at the next tick. This is achieved by ORing
> + * tick_broadcast_mask into tick_broadcast_oneshot_mask.
> + *
> + * For other callers, e.g. broadcast device replacement,
> + * tick_broadcast_oneshot_mask must not be touched as this would
> + * set bits for CPUs which are already NOHZ, but not idle. Their
> + * next tick_broadcast_enter() would observe the bit set and fail
> + * to update the expiry time and the broadcast event device.
> + */
> + if (from_periodic) {
> cpumask_copy(tmpmask, tick_broadcast_mask);
> + /* Remove the local CPU as it is obviously not idle */
> cpumask_clear_cpu(cpu, tmpmask);
> - cpumask_or(tick_broadcast_oneshot_mask,
> - tick_broadcast_oneshot_mask, tmpmask);
> + cpumask_or(tick_broadcast_oneshot_mask, tick_broadcast_oneshot_mask, tmpmask);
>
> - if (was_periodic && !cpumask_empty(tmpmask)) {
> - ktime_t nextevt = tick_get_next_period();
> + /*
> + * Ensure that the oneshot broadcast handler will wake the
> + * CPUs which are still waiting for periodic broadcast.
> + */
> + nexttick = tick_get_next_period();
> + tick_broadcast_init_next_event(tmpmask, nexttick);
>
> - clockevents_switch_state(bc, CLOCK_EVT_STATE_ONESHOT);
> - tick_broadcast_init_next_event(tmpmask, nextevt);
> - tick_broadcast_set_event(bc, cpu, nextevt);
> - } else
> - bc->next_event = KTIME_MAX;
> - } else {
> /*
> - * The first cpu which switches to oneshot mode sets
> - * the bit for all other cpus which are in the general
> - * (periodic) broadcast mask. So the bit is set and
> - * would prevent the first broadcast enter after this
> - * to program the bc device.
> + * If the underlying broadcast clock event device is
> + * already in oneshot state, then there is nothing to do.
> + * The device was already armed for the next tick
> + * in tick_handle_broadcast_periodic()
> */
> - tick_broadcast_clear_oneshot(cpu);
> + if (clockevent_state_oneshot(bc))
> + return;
> }
> +
> + /*
> + * When switching from periodic to oneshot mode arm the broadcast
> + * device for the next tick.
> + *
> + * If the broadcast device has been replaced in oneshot mode and
> + * the oneshot broadcast mask is not empty, then arm it to expire
> + * immediately in order to reevaluate the next expiring timer.
> + * nexttick is 0 and therefore in the past which will cause the
> + * clockevent code to force an event.
> + *
> + * For both cases the programming can be avoided when the oneshot
> + * broadcast mask is empty.
> + *
> + * tick_broadcast_set_event() implicitly switches the broadcast
> + * device to oneshot state.
> + */
> + if (!cpumask_empty(tick_broadcast_oneshot_mask))
> + tick_broadcast_set_event(bc, cpu, nexttick);

For the case where the other CPUs have already installed their
tick devices and this function is called with from_periodic=true,
the other CPUs will notice the oneshot change on their next call to
tick_broadcast_enter() thanks to the lock, right? So the tick broadcast
will keep firing until all CPUs have been through idle once and called
tick_broadcast_exit(), right? Because only they can clear themselves
from tick_broadcast_oneshot_mask. Am I understanding this correctly?
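
To make the question concrete, here is a minimal user-space sketch of
the mask handling as I read it. It is not kernel code: struct bc_model,
mask_t, model_setup_oneshot() and model_broadcast_exit() are
hypothetical names, the masks are plain 32-bit words instead of
cpumasks, and locking, the clockevent state checks and the actual
device programming are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU, not the kernel's cpumask API. */
typedef uint32_t mask_t;

struct bc_model {
	mask_t broadcast_mask;	/* stands in for tick_broadcast_mask */
	mask_t oneshot_mask;	/* stands in for tick_broadcast_oneshot_mask */
};

/*
 * Models only the mask side of tick_broadcast_setup_oneshot():
 * with from_periodic set, the periodic waiters minus the local CPU
 * are ORed into the oneshot mask. Returns true when the oneshot mask
 * is not empty, i.e. when the broadcast device would have to be armed.
 */
static bool model_setup_oneshot(struct bc_model *m, unsigned int cpu,
				bool from_periodic)
{
	assert(cpu < 32);

	if (from_periodic) {
		mask_t tmp = m->broadcast_mask & ~((mask_t)1 << cpu);

		m->oneshot_mask |= tmp;
	}
	return m->oneshot_mask != 0;
}

/* Models a CPU leaving idle: it clears only its own bit. */
static void model_broadcast_exit(struct bc_model *m, unsigned int cpu)
{
	assert(cpu < 32);
	m->oneshot_mask &= ~((mask_t)1 << cpu);
}
```

With CPUs 0-2 in the broadcast mask and CPU 0 doing the switch, the
model leaves bits 1 and 2 set until each of those CPUs calls
model_broadcast_exit() itself, which is the behaviour I am asking
about. With from_periodic=false the oneshot mask is left untouched.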

I'm trying to find a window for a race in which dev->next_event is
seen as too far ahead in the future, but I haven't found one so far...

Thanks.