On Mon, Sep 25, 2023 at 10:11 AM Lukasz Luba <lukasz.luba@xxxxxxx> wrote:
First off, I would merge this with the previous patch, as the changes
would be much clearer then IMO.
Add an interface which allows to modify EM power data at runtime.
The new power information is populated by the provided callback, which
is called for each performance state.
But it all starts with copying the frequencies from the default table.
The CPU frequencies' efficiency is
re-calculated since that might be affected as well. The old EM memory
is going to be freed later using RCU mechanism.
Not all of it, but the old runtime table that is not going to be used any more.
Signed-off-by: Lukasz Luba <lukasz.luba@xxxxxxx>
+/**
+ * em_dev_update_perf_domain() - Update runtime EM table for a device
+ * @dev : Device for which the EM is to be updated
+ * @cb : Callback function providing the power data for the EM
+ * @priv : Pointer to private data useful for passing context
+ * which might be required while calling @cb
It is still unclear to me who is going to use this priv pointer and how.
+ *
+ * Update EM runtime modifiable table for a @dev using the callback
+ * defined in @cb. The EM new power values are then used for calculating
+ * the em_perf_state::cost for associated performance state.
It actually allocates a new runtime table and populates it from
scratch, using the frequencies from the default table and the
callback.
+ *
+ * This function uses mutex to serialize writers, so it must not be called
"a mutex"
+ * from non-sleeping context.
+
+ if (!dev || !dev->em_pd) {
Checking dev against NULL under the mutex is pointless (either it is
NULL or it isn't, so check it earlier).
+ ret = -EINVAL;
+ goto unlock_em;
+ }
+
+ pd = dev->em_pd;
And I would check pd against NULL here.
+
+ runtime_table = kzalloc(sizeof(*runtime_table), GFP_KERNEL);
+ if (!runtime_table) {
+ ret = -ENOMEM;
+ goto unlock_em;
+ }
+
+ runtime_table->state = kcalloc(pd->nr_perf_states,
+ sizeof(struct em_perf_state),
+ GFP_KERNEL);
+ if (!runtime_table->state) {
+ ret = -ENOMEM;
+ goto free_runtime_table;
+ }
The above allocations can be merged into one and allocating memory
under the mutex is questionable.
@@ -501,9 +598,23 @@ void em_dev_unregister_perf_domain(struct device *dev)
runtime_table = pd->runtime_table;
+ /*
+ * Safely destroy runtime modifiable EM. By using the call
+ * synchronize_rcu() we make sure we don't progress till last user
+ * finished the RCU section and our update got applied.
+ */
rcu_assign_pointer(pd->runtime_table, NULL);
synchronize_rcu();
+ /*
+ * After the sync no updates will be in-flight, so free the
+ * memory allocated for runtime table (if there was such).
+ */
+ if (runtime_table != pd->default_table) {
+ kfree(runtime_table->state);
+ kfree(runtime_table);
+ }
Can't this race with the RCU callback freeing the runtime table?