Re: [PATCH v3 1/2] PM / devfreq: Generic CPU frequency to device frequency mapping governor

From: skannan
Date: Thu Aug 02 2018 - 17:00:18 EST


On 2018-08-02 02:56, MyungJoo Ham wrote:
Many CPU architectures have caches that can scale independent of the CPUs.
Frequency scaling of the caches is necessary to make sure the cache is not
a performance bottleneck that leads to poor performance and power. The same
idea applies for RAM/DDR.

To achieve this, this patch adds a generic devfreq governor that takes the
current frequency of each CPU frequency domain and then adjusts the
frequency of the cache (or any devfreq device) based on the frequency of
the CPUs. It listens to CPU frequency transition notifiers to keep itself
up to date on the current CPU frequency.

To decide the frequency of the device, the governor does one of the
following:

This exactly has the same purpose with "passive" governor except for the
single part: passive governor depends on another devfreq driver, not
cpufreq driver.

If both governors have many features in common, can you merge them into one?
Passive governor also has "get_target_freq", which allows driver authors
to define the mapping.

Thanks for the review and pointing me to the passive governor. I agree that at a high level they are both doing the same. I can certainly stuff this CPU freq to dev freq mapping into passive governor, but I think it'll just make one huge set of code that's harder to understand and maintain because it trying to do different things under the hood.

There are also a bunch of complexities and optimizations that come with the cpufreq-map governor that don't fit with the passive governor.

1. It's not one CPU who's frequency we have to listen to. There are multiple CPUs/policies we have to aggregate across.
2. We have to deal with big vs little having different needs/mappings.
3. Since it's always just CPUfreq, I can optimize the handling in the transition notifiers. If I have 4 different devices that are scaled based on CPU freq, I still use only 1 transition notifier. It becomes a bit harder to do with the passive governor.
4. It requires that the device explicitly support the passive governor and pick a mapping function. With cpufreq-map governor, the device drivers don't need to make any changes. Whoever is making a device/board can choose what devices to scale up base on CPU freq based on their board and their needs. Even an end user can say, scale the GPU based on my CPU based on interpolation if they choose to.
5. If a device has some other use for the private data, it can't work with passive governor, but can work with cpufreq-map governor.
6. I also want to improve cpufreq-map governor to handle hotplug correctly in later patches (needs more discussion) and that'll add more complexity.

I think for these reasons we shouldn't combine these two governors. Let me know what you think.

Probably, you will need to add one more notifier call to get events from
cpufreq instead of devfreq.


* Uses a CPU frequency to device frequency mapping table
- Either one mapping table used for all CPU freq policies (typically used
for system with homogeneous cores/clusters that have the same OPPs).
- One mapping table per CPU freq policy (typically used for ASMP systems
with heterogeneous CPUs with different OPPs)

OR

* Scales the device frequency in proportion to the CPU frequency. So, if
the CPUs are running at their max frequency, the device runs at its max
frequency. If the CPUs are running at their min frequency, the device
runs at its min frequency. And interpolated for frequencies in between.

If you want to satisfy these two cases to the "modified" passive governors,
you may need to supply two "preloaded" get_target_freq functions for
struct devfreq_passive_data. If that's impossible, requiring a bit more
modifications, you may need to write alternative routines in
update_devfreq_passive().


Thanks for the pointers. I understood what you mean here.

diff --git a/drivers/devfreq/governor_cpufreq_map.c b/drivers/devfreq/governor_cpufreq_map.c
new file mode 100644
index 0000000..084a3ff
--- /dev/null
+++ b/drivers/devfreq/governor_cpufreq_map.c
@@ -0,0 +1,583 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2014-2015, 2018, The Linux Foundation. All rights reserved.
+ */

Is this boilerplate fine? ( using // )

I was just following what I thought was the new standard. checkpatch even complains about not having this.

Thanks,
Saravana