Re: [RFC PATCH 11/11] sched: Add comments to find_busiest_group() function.

From: Gautham R Shenoy
Date: Wed Mar 25 2009 - 07:43:39 EST


On Wed, Mar 25, 2009 at 02:44:27PM +0530, Gautham R Shenoy wrote:
> Add /** style comments around find_busiest_group(). Also add a few explanatory
> comments.

<snip>

> static struct sched_group *
> find_busiest_group(struct sched_domain *sd, int this_cpu,
> @@ -3593,17 +3613,31 @@ find_busiest_group(struct sched_domain *sd, int this_cpu,
> update_sd_lb_stats(sd, this_cpu, idle, sd_idle, cpus,
> balance, &sds);
>
> + /* Cases where imbalance does not exist from POV of this_cpu */
> + /* 1) this_cpu is not the appropriate cpu to perform load balancing
> + * at this level.
> + * 2) There is no busy sibling group to pull from.
> + * 3) This group is the busiest group.
> + * 4) This group is more busy than the avg busyness at this
> + * sched_domain.
> + * 5) The imbalance is within the specified limit.
> + * 6) Any rebalance would lead to ping-pong
> + */
> if (balance && !(*balance))
> goto ret;
>
> - if (!sds.busiest || sds.this_load >= sds.max_load
> - || sds.busiest_nr_running == 0)
> + if (!sds.busiest || sd.busiest_nr_running == 0)
                        ^^^^^^^^^^^^^^^^^^^^^
This should have been sds.busiest_nr_running (sd is the sched_domain
pointer, not the stats structure), hence the build failure on tip.

I think I missed compile testing this last patch.

Ingo, could you revert commit 7b6340ef884aff69a54f8a530c73ad9da0a7c388 in
tip/balancing and commit the following patch instead?

--->
sched: Add comments to find_busiest_group() function.

From: Gautham R Shenoy <ego@xxxxxxxxxx>

Add /** style comments around find_busiest_group(). Also add a few explanatory
comments.

This concludes the find_busiest_group() cleanup. The function is down to 72
lines from the original 313 lines.

Signed-off-by: Gautham R Shenoy <ego@xxxxxxxxxx>
---

kernel/sched.c | 50 ++++++++++++++++++++++++++++++++++++++++++--------
1 files changed, 42 insertions(+), 8 deletions(-)


diff --git a/kernel/sched.c b/kernel/sched.c
index 6404ddf..a48cf9d 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -3572,10 +3572,30 @@ static inline void calculate_imbalance(struct sd_lb_stats *sds, int this_cpu,
}
/******* find_busiest_group() helpers end here *********************/

-/*
- * find_busiest_group finds and returns the busiest CPU group within the
- * domain. It calculates and returns the amount of weighted load which
- * should be moved to restore balance via the imbalance parameter.
+/**
+ * find_busiest_group - Returns the busiest group within the sched_domain
+ * if there is an imbalance. If there isn't an imbalance, and
+ * the user has opted for power-savings, it returns a group whose
+ * CPUs can be put to idle by rebalancing those tasks elsewhere, if
+ * such a group exists.
+ *
+ * Also calculates the amount of weighted load which should be moved
+ * to restore balance.
+ *
+ * @sd: The sched_domain whose busiest group is to be returned.
+ * @this_cpu: The cpu for which load balancing is currently being performed.
+ * @imbalance: Variable which stores amount of weighted load which should
+ * be moved to restore balance/put a group to idle.
+ * @idle: The idle status of this_cpu.
+ * @sd_idle: The idleness of sd
+ * @cpus: The set of CPUs under consideration for load-balancing.
+ * @balance: Pointer to a variable indicating if this_cpu
+ * is the appropriate cpu to perform load balancing at this_level.
+ *
+ * Returns: - the busiest group if imbalance exists.
+ * - If no imbalance and user has opted for power-savings balance,
+ * return the least loaded group whose CPUs can be
+ * put to idle by rebalancing its tasks onto our group.
*/
static struct sched_group *
find_busiest_group(struct sched_domain *sd, int this_cpu,
@@ -3593,17 +3613,31 @@ find_busiest_group(struct sched_domain *sd, int this_cpu,
update_sd_lb_stats(sd, this_cpu, idle, sd_idle, cpus,
balance, &sds);

+ /* Cases where imbalance does not exist from POV of this_cpu */
+ /* 1) this_cpu is not the appropriate cpu to perform load balancing
+ * at this level.
+ * 2) There is no busy sibling group to pull from.
+ * 3) This group is the busiest group.
+ * 4) This group is more busy than the avg busyness at this
+ * sched_domain.
+ * 5) The imbalance is within the specified limit.
+ * 6) Any rebalance would lead to ping-pong
+ */
if (balance && !(*balance))
goto ret;

- if (!sds.busiest || sds.this_load >= sds.max_load
- || sds.busiest_nr_running == 0)
+ if (!sds.busiest || sds.busiest_nr_running == 0)
+ goto out_balanced;
+
+ if (sds.this_load >= sds.max_load)
goto out_balanced;

sds.avg_load = (SCHED_LOAD_SCALE * sds.total_load) / sds.total_pwr;

- if (sds.this_load >= sds.avg_load ||
- 100*sds.max_load <= sd->imbalance_pct * sds.this_load)
+ if (sds.this_load >= sds.avg_load)
+ goto out_balanced;
+
+ if (100 * sds.max_load <= sd->imbalance_pct * sds.this_load)
goto out_balanced;

sds.busiest_load_per_task /= sds.busiest_nr_running;
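
For anyone reading along, here is a rough standalone sketch of the order of
the early-exit checks once the conditions are split as in the hunk above.
It is not kernel code: the sketch_* names, the reduced structs and the enum
are made up for illustration, and only the fields the checks touch are kept.
Case 6 (ping-pong) is not covered by the checks in this hunk and is not
modelled.

```c
#include <stddef.h>

/* Hypothetical stand-in for SCHED_LOAD_SCALE; the value is an assumption. */
#define SKETCH_LOAD_SCALE 1024UL

/* Reduced, illustrative versions of sched_domain and sd_lb_stats. */
struct sketch_domain {
	unsigned int imbalance_pct;
};

struct sketch_stats {
	const void *busiest;              /* NULL if no busy sibling group */
	unsigned long busiest_nr_running;
	unsigned long this_load;
	unsigned long max_load;
	unsigned long total_load;
	unsigned long total_pwr;          /* must be non-zero */
};

/* Which of the commented cases ended the search, if any. */
enum sketch_reason {
	SKETCH_NOT_BALANCER = 1,  /* case 1 */
	SKETCH_NO_BUSY_GROUP,     /* case 2 */
	SKETCH_THIS_IS_BUSIEST,   /* case 3 */
	SKETCH_ABOVE_AVERAGE,     /* case 4 */
	SKETCH_WITHIN_LIMIT,      /* case 5 */
	SKETCH_IMBALANCED         /* none of the above: go on to pull load */
};

static enum sketch_reason
sketch_balanced_reason(const struct sketch_domain *sd,
		       const struct sketch_stats *sds, const int *balance)
{
	unsigned long avg_load;

	/* 1) this_cpu is not the one that should balance at this level. */
	if (balance && !(*balance))
		return SKETCH_NOT_BALANCER;

	/* 2) No busy sibling group to pull from. Note: sds, not sd. */
	if (!sds->busiest || sds->busiest_nr_running == 0)
		return SKETCH_NO_BUSY_GROUP;

	/* 3) Our own group is already the busiest. */
	if (sds->this_load >= sds->max_load)
		return SKETCH_THIS_IS_BUSIEST;

	/* 4) Our group is at or above the domain-wide average load. */
	avg_load = (SKETCH_LOAD_SCALE * sds->total_load) / sds->total_pwr;
	if (sds->this_load >= avg_load)
		return SKETCH_ABOVE_AVERAGE;

	/* 5) The busiest group is within imbalance_pct percent of ours. */
	if (100 * sds->max_load <= sd->imbalance_pct * sds->this_load)
		return SKETCH_WITHIN_LIMIT;

	return SKETCH_IMBALANCED;
}
```

As a worked case, assuming imbalance_pct = 125, this_load = 1000 and
max_load = 1200: 100 * 1200 = 120000 is not more than 125 * 1000 = 125000,
so check 5 treats the domain as balanced. With max_load = 2000 the left side
is 200000 and the check no longer short-circuits.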

--
Thanks and Regards
gautham