Re: [PATCH v2] tick/broadcast: Do not set oneshot_mask except was_periodic was true

From: Thomas Gleixner
Date: Mon Apr 24 2023 - 14:28:37 EST


On Sun, Apr 23 2023 at 22:16, Victor Hassan wrote:
> On 4/16/2023 5:01 AM, Thomas Gleixner wrote:
>> After more analysis of that code it turns out that this is even more
>> broken because of this:
>>
>> CPU0 CPU1
>>
>> idle()
>> tick_broadcast_enter()
>> test_and_set_cpu(cpu, oneshot_mask);
>> shutdown_cpu_local_device();
>> tick_broadcast_set_event();
>> sleep_deep();
>>
>> // All good. Broadcast will wake the CPU up
>>
>> install_new_broadcast_device(newdev)
>> tick_broadcast_setup_oneshot(newdev)
>> if (was_periodic) <- Path not taken because device is in shutdown state
>
> Are you saying that the "tick_broadcast_enter->broadcast_shutdown_local"
> path will turn off the cpu1 tick device(as the broadcast)?

No. On CPU1 the idle path does:

- Mark the CPU in the broadcast oneshot mask
- Conditionally shut down the CPU local clock event device
- Set the broadcast event

> I think this only happens when CPU1's tick device is used as the
> broadcast device. However, the "broadcast_needs_cpu" path prevents this
> from happening, right?

No. broadcast_shutdown_local() checks first whether

- the broadcast device is hrtimer based, i.e. a software emulation
- the broadcast device is armed
- the CPU handling the hrtimer is the current CPU

If all apply then the CPU local device cannot be shut down.

Then it checks whether:

- the broadcast device is hrtimer based, i.e. a software emulation
- the CPU local event is before the broadcast event, as it cannot
reprogram the remote CPUs clockevent device

If all apply then the CPU local device cannot be shut down.

> Nevertheless, there is still an issue here. At this point, the broadcast
> will be in oneshot state (was_periodic is still false). The reason why
> this has not caused any serious problems may be because other CPUs will
> quickly enter idle to help refresh the broadcast.

The installation of a new broadcast device is a rare event and yes, if
the CPU which installed it or come other CPU goes idle shortly after it
will arm the broadcast event and stuff keeps moving, but that's far from
correct.

Thanks,

tglx