Re: cpufreq: intel_pstate: map utilization into the pstate range

From: Rafael J. Wysocki
Date: Tue Jan 04 2022 - 14:22:46 EST

Next message: Sunil Muthuswamy: "RE: [PATCH v7 2/2] PCI: hv: Add arm64 Hyper-V vPCI support"
Previous message: Nathan Chancellor: "Re: [PATCH] [v3] x86/sgx: Fix NULL pointer dereference on non-SGX systems"
In reply to: Julia Lawall: "Re: cpufreq: intel_pstate: map utilization into the pstate range"
Next in thread: Julia Lawall: "Re: cpufreq: intel_pstate: map utilization into the pstate range"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, Jan 4, 2022 at 4:49 PM Julia Lawall <julia.lawall@xxxxxxxx> wrote:
>
> I tried the whole experiment again on an Intel w2155 (one socket, 10
> physical cores, pstates 12, 33, and 45).
>
> For the CPU there is a small jump between 32 and 33 - less than for the
> 6130.
>
> For the RAM, there is a big jump between 21 and 22.
>
> Combining them leaves a big jump between 21 and 22.

These jumps are most likely related to voltage increases.

> It seems that the definition of efficient is that there is no more cost
> for the computation than the cost of simply having the machine doing any
> computation at all. It doesn't take into account the time and energy
> required to do some actual amount of work.

Well, that's not what I wanted to say.

Of course, the configuration that requires less energy to be spent to
do a given amount of work is more energy-efficient. To measure this,
the system needs to be given exactly the same amount of work for each
run and the energy spent by it during each run needs to be compared.

However, I think that you are interested in answering a different
question: Given a specific amount of time (say T) to run the workload,
at what frequency should the CPUs doing the work run in order to get
the maximum amount of work done per unit of energy spent by the system
(as a whole)? Or, given 2 different frequency levels, at which of them
should the CPUs run to get more work done per energy unit?

The work / energy ratio can be estimated as

W / E = C * f / P(f)

where C is a constant and P(f) is the power drawn by the whole system
while the CPUs doing the work are running at frequency f. Of course,
for the system discussed previously, this ratio is greater in the
2 GHz case.

However P(f) can be divided into two parts, P_1(f) that really depends
on the frequency and P_0 that does not depend on it. If P_0 is large
enough to dominate P(f), which is the case in the 10-20 range of
P-states on the system in question, it is better to run the CPUs doing
the work faster (as long as there is always enough work to do for
them; see below). This doesn't mean that P(f) is not a convex
function of f, though.
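
To make the P_0-dominated case concrete, here is a minimal sketch, not a
measurement. It assumes the illustrative figures used later in this message
(P_0 = 90 W, P(1 GHz) = 100 W, P(2 GHz) = 120 W), sets C = 1, and the
function and variable names are made up for the example.

```python
# Illustrative numbers only (assumed, taken from the example later in this message).
P0_WATTS = 90.0                      # frequency-independent part of the power, P_0
P1_WATTS = {1.0: 10.0, 2.0: 30.0}    # frequency-dependent part P_1(f), keyed by GHz


def work_per_energy(f_ghz, c=1.0):
    """W / E = C * f / P(f) with P(f) = P_0 + P_1(f), CPUs busy all the time."""
    return c * f_ghz / (P0_WATTS + P1_WATTS[f_ghz])


ratio_1ghz = work_per_energy(1.0)
ratio_2ghz = work_per_energy(2.0)
print(f"1 GHz: {ratio_1ghz:.4f}  2 GHz: {ratio_2ghz:.4f}")
```

With these assumed numbers the ratio is higher at 2 GHz, because doubling f
only raises P(f) from 100 W to 120 W while P_0 stays at 90 W.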

Moreover, this assumes that there will always be enough work for the
system to do when running the busy CPUs at 2 GHz, or that it can go
completely idle when it doesn't do any work, but let's see what
happens if the amount of work to do is W_1 = C * 1 GHz * T and the
system cannot go completely idle when the work is done.

Then, nothing changes for the busy CPUs running at 1 GHz, but in the 2
GHz case we get W = W_1 and E = P(2 GHz) * T/2 + P_0 * T/2, because
the busy CPUs are only busy 1/2 of the time, but power P_0 is drawn by
the system regardless. Hence, in the 2 GHz case (assuming P(2 GHz) =
120 W and P_0 = 90 W), we get

W / E = 2 * C * 1 GHz / (P(2 GHz) + P_0) = 0.0095 * C * 1 GHz

which is slightly less than the W / E ratio at 1 GHz, approximately
equal to 0.01 * C * 1 GHz (assuming P(1 GHz) = 100 W), so under these
conditions it would be better to run the busy CPUs at 1 GHz.
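
The arithmetic above can be checked with a short sketch. It is only an
illustration under the stated assumptions (P(1 GHz) = 100 W, P(2 GHz) = 120 W,
P_0 = 90 W, work fixed at W_1 = C * 1 GHz * T, C = 1, T = 1); the function
name is hypothetical and the model only makes sense for f >= 1 GHz.

```python
def work_per_energy_fixed_work(f_ghz, p_busy_watts, p0_watts, t=1.0, c=1.0):
    """W / E when the work is fixed at W_1 = C * 1 GHz * T and the system
    still draws P_0 (it cannot go completely idle) once the work is done."""
    work = c * 1.0 * t                # W_1, in units of C * 1 GHz * T
    busy_fraction = 1.0 / f_ghz       # at 2 GHz the CPUs are busy for T/2
    energy = (p_busy_watts * t * busy_fraction
              + p0_watts * t * (1.0 - busy_fraction))
    return work / energy


ratio_1ghz = work_per_energy_fixed_work(1.0, p_busy_watts=100.0, p0_watts=90.0)
ratio_2ghz = work_per_energy_fixed_work(2.0, p_busy_watts=120.0, p0_watts=90.0)
print(f"1 GHz: {ratio_1ghz:.4f}  2 GHz: {ratio_2ghz:.4f}")
```

In the 2 GHz case the energy is 120 * T/2 + 90 * T/2 = 105 * T, so the ratio
comes out at about 0.0095 versus 0.01 at 1 GHz.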
