Re: [PATCH v4 2/4] cpuidle:powernv: Add helper function to populate powernv idle states.

From: Gautham R Shenoy
Date: Wed Dec 14 2016 - 04:56:39 EST


Hi Balbir,

On Tue, Dec 13, 2016 at 10:51:04PM +1100, Balbir Singh wrote:
>
>
> On 10/12/16 00:32, Gautham R. Shenoy wrote:
> > From: "Gautham R. Shenoy" <ego@xxxxxxxxxxxxxxxxxx>
> >
> > In the current code for powernv_add_idle_states, there is a lot of code
> > duplication while initializing an idle state in powernv_states table.
> >
> > Add an inline helper function to populate the powernv_states[] table for
> > a given idle state. Invoke this for populating the "Nap", "Fastsleep"
> > and the stop states in powernv_add_idle_states.
> >
> > Signed-off-by: Gautham R. Shenoy <ego@xxxxxxxxxxxxxxxxxx>
> > ---
> > drivers/cpuidle/cpuidle-powernv.c | 85 ++++++++++++++++++++++-----------------
> > include/linux/cpuidle.h | 1 +
> > 2 files changed, 50 insertions(+), 36 deletions(-)
> >
> > diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
> > index 7fe442c..db18af1 100644
> > --- a/drivers/cpuidle/cpuidle-powernv.c
> > +++ b/drivers/cpuidle/cpuidle-powernv.c
> > @@ -167,6 +167,24 @@ static int powernv_cpuidle_driver_init(void)
> > return 0;
> > }
> >
> > +static inline void add_powernv_state(int index, const char *name,
> > + unsigned int flags,
> > + int (*idle_fn)(struct cpuidle_device *,
> > + struct cpuidle_driver *,
> > + int),
> > + unsigned int target_residency,
> > + unsigned int exit_latency,
> > + u64 psscr_val)
> > +{
> > + strlcpy(powernv_states[index].name, name, CPUIDLE_NAME_LEN);
> > + strlcpy(powernv_states[index].desc, name, CPUIDLE_NAME_LEN);
>
> Do name and desc ever diverge?

On some other architectures, like kirkwood (see
drivers/cpuidle/cpuidle-kirkwood.c) they do. "desc" field is used to
provide a more descriptive information regarding the idle state.

On POWER, the names were self-explanatory. So, we have desc same as
the name.

>
> > + powernv_states[index].flags = flags;
> > + powernv_states[index].target_residency = target_residency;
> > + powernv_states[index].exit_latency = exit_latency;
> > + powernv_states[index].enter = idle_fn;
>
> Why not call it idle_fn instead of enter?

"enter" is a field name in the generic cpuidle_state structure and
powernv_states[] is an instance of that structure.

>
> > + stop_psscr_table[index] = psscr_val;
> > +}
> > +
> > static int powernv_add_idle_states(void)
> > {
> > struct device_node *power_mgt;
> > @@ -236,6 +254,7 @@ static int powernv_add_idle_states(void)
> > "ibm,cpu-idle-state-residency-ns", residency_ns, dt_idle_states);
> >
> > for (i = 0; i < dt_idle_states; i++) {
> > + unsigned int exit_latency, target_residency;
> > /*
> > * If an idle state has exit latency beyond
> > * POWERNV_THRESHOLD_LATENCY_NS then don't use it
> > @@ -243,28 +262,33 @@ static int powernv_add_idle_states(void)
> > */
> > if (latency_ns[i] > POWERNV_THRESHOLD_LATENCY_NS)
>
> Ideally this should be called POWERNV_MAX_THRESHOLD_LATENCY_NS then

Yes, it can be called that. But then again, we're only interested in
the upper threshold in this code. I will add a comment near the macro
definition.

>
> > continue;
> > + /*
> > + * Firmware passes residency and latency values in ns.
> > + * cpuidle expects it in us.
> > + */
> > + exit_latency = ((unsigned int)latency_ns[i]) / 1000;
> > + if (!rc)
> > + target_residency = residency_ns[i] / 1000;
> > + else
> > + target_residency = 0;
>
> Where do we get rc from? what does target_residency = 0 mean?

The rc value comes from the
of_property_read_u32_array(power_mgt,
"ibm,cpu-idle-state-residency-ns", residency_ns,
dt_idle_states);

just before the for-loop. This tells us whether the firmware has
populated the residency information for the idle state or not.

rc != 0 indicates that the firmware has not populated the value.

Since the governor will pick the first idle state whose
target_residency matches the predicted residency, setting
target_residency = 0 implies that if any stop state is selected at
all, it is the earliest state.


> Balbir Singh
>