Re: [PATCH v4 09/18] PM: EM: Introduce runtime modifiable table

From: Rafael J. Wysocki
Date: Tue Sep 26 2023 - 15:12:24 EST


On Mon, Sep 25, 2023 at 10:11 AM Lukasz Luba <lukasz.luba@xxxxxxx> wrote:
>
> The new runtime table would be populated with a new power data to better
> reflect the actual power. The power can vary over time e.g. due to the
> SoC temperature change. Higher temperature can increase power values.
> For longer running scenarios, such as game or camera, when also other
> devices are used (e.g. GPU, ISP) the CPU power can change. The new
> EM framework is able to addresses this issue and change the data
> at runtime safely.
>
> The runtime modifiable EM data is used by the Energy Aware Scheduler (EAS)
> for the task placement. All the other users (thermal, etc.) are still
> using the default (basic) EM. This fact drove the design of this feature.
>
> Signed-off-by: Lukasz Luba <lukasz.luba@xxxxxxx>
> ---
> include/linux/energy_model.h | 4 +++-
> kernel/power/energy_model.c | 12 +++++++++++-
> 2 files changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
> index 546dee90f716..740e7c25cfff 100644
> --- a/include/linux/energy_model.h
> +++ b/include/linux/energy_model.h
> @@ -39,7 +39,7 @@ struct em_perf_state {
> /**
> * struct em_perf_table - Performance states table
> * @state: List of performance states, in ascending order
> - * @rcu: RCU used for safe access and destruction
> + * @rcu: RCU used only for runtime modifiable table

This still doesn't appear to be used anywhere, so why change it here?

> */
> struct em_perf_table {
> struct em_perf_state *state;
> @@ -49,6 +49,7 @@ struct em_perf_table {
> /**
> * struct em_perf_domain - Performance domain
> * @default_table: Pointer to the default em_perf_table
> + * @runtime_table: Pointer to the runtime modifiable em_perf_table

"Pointer to em_perf_table that can be dynamically updated"

> * @nr_perf_states: Number of performance states
> * @flags: See "em_perf_domain flags"
> * @cpus: Cpumask covering the CPUs of the domain. It's here
> @@ -64,6 +65,7 @@ struct em_perf_table {
> */
> struct em_perf_domain {
> struct em_perf_table *default_table;
> + struct em_perf_table __rcu *runtime_table;
> int nr_perf_states;
> unsigned long flags;
> unsigned long cpus[];
> diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
> index 797141638b29..5b40db38b745 100644
> --- a/kernel/power/energy_model.c
> +++ b/kernel/power/energy_model.c
> @@ -251,6 +251,9 @@ static int em_create_pd(struct device *dev, int nr_states,
> return ret;
> }
>
> + /* Initialize runtime table as default table. */

Redundant comment.

> + rcu_assign_pointer(pd->runtime_table, default_table);
> +
> if (_is_cpu_device(dev))
> for_each_cpu(cpu, cpus) {
> cpu_dev = get_cpu_device(cpu);
> @@ -448,6 +451,7 @@ EXPORT_SYMBOL_GPL(em_dev_register_perf_domain);
> */
> void em_dev_unregister_perf_domain(struct device *dev)
> {
> + struct em_perf_table __rcu *runtime_table;
> struct em_perf_domain *pd;
>
> if (IS_ERR_OR_NULL(dev) || !dev->em_pd)
> @@ -457,18 +461,24 @@ void em_dev_unregister_perf_domain(struct device *dev)
> return;
>
> pd = dev->em_pd;
> -

Unrelated change.

> /*
> * The mutex separates all register/unregister requests and protects
> * from potential clean-up/setup issues in the debugfs directories.
> * The debugfs directory name is the same as device's name.
> */
> mutex_lock(&em_pd_mutex);
> +

Same here.

> em_debug_remove_pd(dev);
>
> + runtime_table = pd->runtime_table;
> +
> + rcu_assign_pointer(pd->runtime_table, NULL);
> + synchronize_rcu();

Is it really a good idea to call this under a mutex?

> +
> kfree(pd->default_table->state);
> kfree(pd->default_table);
> kfree(dev->em_pd);
> +

Unrelated change.

> dev->em_pd = NULL;
> mutex_unlock(&em_pd_mutex);
> }
> --

So this really adds a pointer to a table that can be dynamically
updated to struct em_perf_domain without any users so far. It is not
used anywhere as of this patch AFAICS, which is not what the changelog
is saying.