Re: [PATCH v3] sched/topology: remove sysctl_sched_energy_aware depending on the architecture

From: Shrikanth Hegde
Date: Wed Sep 20 2023 - 13:48:20 EST




On 9/15/23 5:30 PM, Valentin Schneider wrote:
> On 14/09/23 23:26, Shrikanth Hegde wrote:
>> On 9/14/23 9:51 PM, Valentin Schneider wrote:
>>> On 13/09/23 17:18, Shrikanth Hegde wrote:
>>>> sysctl_sched_energy_aware lets the admin enable or disable
>>>> energy aware scheduling (EAS). EAS is enabled only if the platform
>>>> meets a few conditions: asymmetric CPU capacity, no SMT, a valid
>>>> cpufreq policy, and frequency invariant load tracking. A platform
>>>> may lack EAS capability at boot but gain it later, for example
>>>> when a cpufreq policy is changed or registered.
>>>>
>>>> At present, even when the platform doesn't support EAS, this sysctl is
>>>> still present. Writing 1 to it triggers a rebuild of the sched domains,
>>>> while writing 0 is a NOP. That is confusing and unnecessary.
>>>>
>>>
>>
>> Hi Valentin, Thanks for taking a look at this patch.
>>
>>> But why would you write to it in the first place? Or do you mean to use
>>> this as an indicator for userspace that EAS is supported?
>>>
>>
>> Since this sysctl is present and its value is 1, it gives the user
>> the impression that EAS is supported when it is not.
>> So this is an attempt to correct that.
>>
>
> Ah, I see. Then how about just making the sysctl return 0 when EAS isn't
> supported? And on top of it, prevent all writes when EAS isn't supported
> (perf domains cannot be built, so there would be no point in forcing a
> rebuild that will do nothing).
>
> I can never remember how to properly use the sysctl API, so that's a very
> crude implementation, but something like so?
>
> ---
>

I tried the approach below. Instead of sched_energy_enabled(), it uses a
helper function that performs checks similar to those in build_perf_domains(),
and build_perf_domains() calls the same helper as well.

# cat sched_energy_aware
# echo 0 > sched_energy_aware
-bash: echo: write error: Operation not supported
# echo 1 > sched_energy_aware
-bash: echo: write error: Operation not supported
#
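To make the helper idea concrete, here is a rough userspace sketch of the kind of check I mean. The names (struct platform_caps, eas_possible) are made up for illustration and are not kernel APIs; the real helper would query scheduler and cpufreq state rather than plain booleans. It only mirrors the four conditions listed in the changelog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of platform state; not a kernel structure. */
struct platform_caps {
	bool asym_cpu_capacity;  /* CPUs differ in capacity */
	bool smt_active;         /* SMT siblings are present */
	bool cpufreq_policy_ok;  /* a valid cpufreq policy is registered */
	bool freq_invariant;     /* frequency invariant load tracking */
};

/* Returns true only when every EAS precondition from the changelog holds. */
static bool eas_possible(const struct platform_caps *caps)
{
	assert(caps != NULL);

	if (!caps->asym_cpu_capacity)
		return false;
	if (caps->smt_active)
		return false;
	if (!caps->cpufreq_policy_ok)
		return false;
	if (!caps->freq_invariant)
		return false;

	return true;
}
```

Because the conditions can change after boot (for example when a cpufreq policy is registered), a check like this would have to be evaluated on every sysctl access rather than cached once.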



> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 05a5bc678c089..dadfc5afc4121 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -230,9 +230,28 @@ static int sched_energy_aware_handler(struct ctl_table *table, int write,
>  	if (write && !capable(CAP_SYS_ADMIN))
>  		return -EPERM;
>
> +	if (!sched_energy_enabled()) {
> +		if (write)
> +			return -EOPNOTSUPP;
> +		else {
> +			size_t len;
> +
> +			if (*ppos) {
> +				*lenp = 0;
> +				return 0;

Is it possible for *ppos to be 0? If so, in which scenario?
Also, does it make sense to set the length to 0 unconditionally if EAS
is not possible?
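For reference, my understanding of the offset handling, as a userspace sketch only: a reader such as cat calls read() repeatedly until it gets 0 bytes back, so the first call arrives with *ppos == 0 and any later call with *ppos != 0. The function name and the plain-pointer signature below are assumptions for illustration; this is not the kernel handler prototype.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the handler path taken when EAS is unavailable. */
static int eas_unsupported_handler(bool write, char *buf, size_t *lenp,
				   long long *ppos)
{
	static const char zero[] = "0\n";
	size_t len = sizeof(zero) - 1;

	assert(buf && lenp && ppos);

	/* Writes are refused outright. */
	if (write)
		return -EOPNOTSUPP;

	/* Non-zero offset: "0\n" was already consumed, so report EOF. */
	if (*ppos) {
		*lenp = 0;
		return 0;
	}

	/* First read: copy out "0\n", clamped to the caller's buffer size. */
	if (*lenp < len)
		len = *lenp;
	memcpy(buf, zero, len);
	*lenp = len;
	*ppos += len;
	return 0;
}
```

With this shape the first read returns "0\n" and the second returns nothing, which ends the cat. Setting the length to 0 unconditionally would instead make the file read as empty.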

> +			}
> +
> +			len = snprintf((char *)buffer, 3, "0\n");
> +
> +			*lenp = len;
> +			*ppos += len;
> +			return 0;
> +		}
> +	}
> +
>  	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);