Re: [RFC PATCH v3 2/5] cpuidle: Add Cpufreq Active Stats calls tracking idle entry/exit

From: Lukasz Luba
Date: Tue Apr 26 2022 - 11:01:55 EST


Hi Artem,

Thanks for the comments!

On 4/26/22 13:05, Artem Bityutskiy wrote:
Hi Lukasz,

On Wed, 2022-04-06 at 23:08 +0100, Lukasz Luba wrote:
@@ -231,6 +232,8 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
        trace_cpu_idle(index, dev->cpu);
        time_start = ns_to_ktime(local_clock());
+       cpufreq_active_stats_cpu_idle_enter(time_start);
+
        stop_critical_timings();
        if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE))
                rcu_idle_enter();
@@ -243,6 +246,8 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
        time_end = ns_to_ktime(local_clock());
        trace_cpu_idle(PWR_EVENT_EXIT, dev->cpu);
+       cpufreq_active_stats_cpu_idle_exit(time_end);
+

At this point the interrupts are still disabled, and they get enabled later. So
the more code you add here and the longer it executes, the longer you delay the
interrupts. Therefore, you are effectively increasing IRQ latency from idle by
adding more code here.

Good point. I had added it just below the trace_cpu_idle().


How much? I do not know, depends on how much code you need to execute. But the
amount of code in functions like this tends to increase over time.

So the risk is that we'll keep adding code to 'cpufreq_active_stats_cpu_idle_exit()',
and (maybe unintentionally) increase idle interrupt latency.

This is not ideal.

I agree, I will try to find a better place to put this call.


We use the 'wult' tool (https://github.com/intel/wult) to measure C-state
latency and interrupt latency on Intel platforms, and for fast C-states like
Intel C1, we can see that even the current code between C-state exit and
interrupt re-enabling adds measurable overhead.

Thanks for the hint and the link. I'll check that tool. I don't know if
it will work with my platforms. I might create some tests for these
latency measurements.


I am worried about adding more stuff here.

Please, consider getting the stats after interrupts are re-enabled. You may lose
some "precision" because of that, but it is probably better overall than adding
to idle interrupt latency.

Definitely. I don't need such precision, so later when interrupts are
re-enabled is OK for me.


        /* The cpu is no longer idle or about to enter idle. */
        sched_idle_set_state(NULL);



This new call might be empty for your x86 kernels, since you probably
set CONFIG_CPU_FREQ_STAT. I can add an additional config option so that
platforms can still have CONFIG_CPU_FREQ_STAT but avoid this new feature
and the additional overhead in idle exit when e.g.
CONFIG_CPU_FREQ_ACTIVE_STAT is not set.

The x86 platforms won't use the IPA governor, so it's reasonable to
do it this way.

Does this sound good?

Regards,
Lukasz