Re: [PATCH 4/4] sched: cpufreq_cfs: pelt-based cpu frequency scaling

From: Juri Lelli
Date: Tue May 05 2015 - 08:16:05 EST


Hi Peter,

thanks a lot for the fast reply! :)

On 05/05/15 10:00, Peter Zijlstra wrote:
> On Mon, May 04, 2015 at 03:10:41PM -0700, Michael Turquette wrote:
>> This policy is implemented using the cpufreq governor interface for two
>> main reasons:
>>
>> 1) re-using the cpufreq machine drivers without using the governor
>> interface is hard.
>>
>> 2) using the cpufreq interface allows us to switch between the
>> scheduler-driven policy and legacy cpufreq governors such as ondemand at
>> run-time. This is very useful for comparative testing and tuning.
>
> Urgh,. so I don't really like that. It adds a lot of noise to the
> system. You do the irq work thing to kick the cpufreq threads which do
> their little thing -- and their wakeup will influence the cfs
> accounting, which in turn will start the whole thing anew.
>

Right, we introduce some overhead, but in the end it should be less
than, or at least similar to, what we already have today with ondemand,
for example. The idea here is that we should trigger this kthread
wrapper only when it is really needed, and maybe reduce the work it
needs to do. The irq work is one way we could kick it from wherever we
want to.

Regarding cfs accounting, the bad trick is that we run these kthreads
as SCHED_FIFO. One reason is that we expect them to run quickly and we
usually want them to run as soon as they are woken up (possibly
preempting the task for which they are adapting the frequency). Of
course we'll have the same accounting problem within RT, but maybe we
could associate some special flag with them and treat them differently.
Or else we could just accept that we need this kind of small wrapper
task, whose behaviour we should know, and live with that.

Anyway, I'm currently experimenting with driving this thing a bit
differently from what we have in this patchset. I'm trying to reduce
the need to trigger the whole machinery to a minimum. Do you think it
is still worth giving it a look?

> I would really prefer you did a whole new system with directly invoked
> drivers that avoid the silly dance. Your 'new' ARM systems should be
> well capable of that.
>

Right, this thing is maybe not the cleanest solution we could come up
with, and how 'new' ARM systems will work may help us design a better
one, but I'm not sure what we can really do about today's systems.
We of course need to support them (and for a few years, I guess)
and we would also like to have an event-driven solution to drive OPP
selection from the scheduler at the same time.

From what I can tell, this "non-cpufreq" new system will probably have
to re-implement what current drivers are doing, and it will still have
to sleep during freq changes (at least for ARM). This will require some
asynchronous way of doing the freq changes, which is what the kthread
solution is already doing.
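
To make the asynchronous point concrete, here is a rough userspace
analogy using pthreads. It is not kernel code and not what the patchset
does: struct freq_worker, request_freq() and worker_fn() are names made
up for this sketch, and the nanosleep() call merely stands in for a
clock/regulator transition that has to sleep. The requester only records
the target and wakes the worker; the worker does the slow part.

```c
/* Userspace analogy only; all names here are hypothetical. */
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

struct freq_worker {
    pthread_mutex_t lock;
    pthread_cond_t  kick;
    unsigned int    requested_khz; /* latest target from the fast path */
    unsigned int    current_khz;   /* what the pretend hardware runs at */
    unsigned int    transitions;   /* completed frequency changes */
    bool            pending;
    bool            stop;
};

/* Fast path: record the request and wake the worker, nothing more. */
static void request_freq(struct freq_worker *w, unsigned int khz)
{
    pthread_mutex_lock(&w->lock);
    w->requested_khz = khz;   /* a newer request overwrites an older one */
    w->pending = true;
    pthread_cond_signal(&w->kick);
    pthread_mutex_unlock(&w->lock);
}

/* Slow path: runs in its own thread and is allowed to sleep. */
static void *worker_fn(void *arg)
{
    struct freq_worker *w = arg;
    struct timespec delay = { .tv_sec = 0, .tv_nsec = 1000000 };

    pthread_mutex_lock(&w->lock);
    for (;;) {
        while (!w->pending && !w->stop)
            pthread_cond_wait(&w->kick, &w->lock);
        if (!w->pending)
            break;            /* stop requested and nothing left to do */

        unsigned int target = w->requested_khz;
        w->pending = false;
        pthread_mutex_unlock(&w->lock);

        nanosleep(&delay, NULL); /* stand-in for the sleeping transition */

        pthread_mutex_lock(&w->lock);
        w->current_khz = target;
        w->transitions++;
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}
```

Requests that arrive while a transition is in flight collapse into the
most recent one, so the worker never does more transitions than there
were requests. In this sketch the struct is expected to be set up with
PTHREAD_MUTEX_INITIALIZER and PTHREAD_COND_INITIALIZER, and shut down by
setting stop under the lock and signalling kick.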

Best,

- Juri

> You can still do 2 if you create a cpufreq off switch. You can then
> either enable the sched one or the legacy cpufreq -- or both if you want
> a trainwreck ;-)
>
> As to the drivers, they're mostly fairly small and self contained, it
> should not be too hard to hack them up to work without cpufreq.
>
