Re: [PATCH 3/3] sched: terminate newidle balancing once at least one task has moved over

From: Gregory Haskins
Date: Tue Jun 24 2008 - 12:55:30 EST


>>> On Tue, Jun 24, 2008 at 9:31 AM, in message <1214314273.4351.34.camel@twins>,
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Tue, 2008-06-24 at 07:18 -0600, Gregory Haskins wrote:
>> >>> On Tue, Jun 24, 2008 at 6:13 AM, in message <1214302406.4351.23.camel@twins>,
>> Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> > On Mon, 2008-06-23 at 17:04 -0600, Gregory Haskins wrote:
>> >> Inspired by Peter Zijlstra.
>> >>
>> >> Signed-off-by: Gregory Haskins <ghaskins@xxxxxxxxxx>
>> >> ---
>> >>
>> >> kernel/sched.c | 4 ++++
>> >> 1 files changed, 4 insertions(+), 0 deletions(-)
>> >>
>> >> diff --git a/kernel/sched.c b/kernel/sched.c
>> >> index 3efbbc5..c8e8520 100644
>> >> --- a/kernel/sched.c
>> >> +++ b/kernel/sched.c
>> >> @@ -2775,6 +2775,10 @@ static int move_tasks(struct rq *this_rq, int this_cpu, struct rq *busiest,
>> >> max_load_move - total_load_moved,
>> >> sd, idle, all_pinned, &this_best_prio);
>> >> class = class->next;
>> >> +
>> >> + if (idle == CPU_NEWLY_IDLE && this_rq->nr_running)
>> >> + break;
>> >> +
>> >> } while (class && max_load_move > total_load_moved);
>> >>
>> >> return total_load_moved > 0;
>> >
>> >
>> > right,.. uhm, except that you forgot all the other fixes and
>> > generalizations I had,..
>>
>> Heh...well I intentionally simplified it, but perhaps that is out of
>> ignorance. I did say "inspired by" ;)
>>
>> >
>> > The LB_START/LB_COMPLETE stuff is needed to fix CFS load balancing. It
>> > now always iterates the first sysctl_sched_nr_migrate tasks, and if it
>> > doesn't find any there, just gives up - which isn't too big of a problem
>> > with it set to 32, but if you drop it to 2/4 stuff starts falling apart.
>> >
>> > And the break I had here, only checks classes above and equal to the
>> > current class.
>> >
>> > This again is needed when you have more classes.
>>
>> I'm not sure I understand/agree here (unless you plan on having a class below
>> sched_idle()??)
>>
>> The fact that we are going NEWLYIDLE to me implies that all the other classes are
>> "above or equal". And rq->nr_running approximates all the per-class vtable work
>> that you had done to probe the higher classes. We currently only hit this code when
>> rq->nr_running == 0, so rq->nr_running != 0 seems like a logical termination
>> condition.
>>
>> I guess what I am not clear on is: "when would we be NEWLYIDLE in a higher class,
>> yet have tasks populated in lower classes such that nr_running is non-zero".
>> Additionally, even if we have that condition (e.g. with something like the EDF work you
>> are doing, perhaps?), shouldn't we patch the advanced form of this logic when the rest
>> of the code goes in? For now, this seems like the most straightforward way to
>> accomplish the goal. But I could be missing something ;)
>
> The thing I'm worried about - but it might be unfounded and is certainly
> so now - is that suppose we have:
>
> EDF
> FIFO/RR
> SOFTRT
> OTHER
> IDLE
>
> and we've just done FIFO/RR (which is a nop) and some interrupt woke
> an OTHER task while we dropped for lockbreak.
>
> At this point your logic would bail out and start running the OTHER
> task, even though we might have found a SOFTRT task to run had we
> bothered to look.
>

Ok, now I think I understand your concern. But I think you may be worrying about
this at the wrong level. I would think we should be doing something similar to the
post-balance patch I submitted a while back. It basically iterated through each class,
giving each an opportunity to pull tasks over in its own way. The difference there
was that I was doing it post-schedule to deal with that locking issue. We could
take the same idea and do it where we pre_schedule() today.
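
To make that concrete, here is a rough userspace sketch of the per-class walk. It is not kernel code: toy_rq, toy_class, pull_one() and toy_pre_schedule() are names made up for illustration, the class list borrows your hypothetical SOFTRT example, and the "stop at the first class that already has work or manages to pull some" rule is my assumption about how the walk would terminate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical class ids, highest priority first; not the kernel's real list. */
enum class_id { CLASS_RT, CLASS_SOFTRT, CLASS_OTHER, CLASS_IDLE, NR_CLASSES };

/* Toy runqueue: per-class counters stand in for the real per-class queues. */
struct toy_rq {
	int nr_running[NR_CLASSES]; /* tasks already runnable on this CPU */
	int remote[NR_CLASSES];     /* tasks some other CPU could hand over */
};

struct toy_class {
	enum class_id id;
	const struct toy_class *next;                     /* next lower class */
	int (*pull)(struct toy_rq *rq, enum class_id id); /* may be NULL */
};

/* Generic pull policy for the toy: take one task if any is available. */
static int pull_one(struct toy_rq *rq, enum class_id id)
{
	if (rq->remote[id] == 0)
		return 0;
	rq->remote[id]--;
	rq->nr_running[id]++;
	return 1;
}

static const struct toy_class idle_class   = { CLASS_IDLE,   NULL,          NULL };
static const struct toy_class other_class  = { CLASS_OTHER,  &idle_class,   pull_one };
static const struct toy_class softrt_class = { CLASS_SOFTRT, &other_class,  pull_one };
static const struct toy_class rt_class     = { CLASS_RT,     &softrt_class, pull_one };

/*
 * Walk the classes from highest to lowest priority. Stop at the first class
 * that already has work or manages to pull some, since a lower class could
 * not win the next pick anyway. Returns the id of the class that will run.
 */
static enum class_id toy_pre_schedule(struct toy_rq *rq)
{
	const struct toy_class *class;

	assert(rq != NULL);
	for (class = &rt_class; class; class = class->next) {
		if (rq->nr_running[class->id])
			return class->id;
		if (class->pull && class->pull(rq, class->id))
			return class->id;
	}
	return CLASS_IDLE;
}
```

With an OTHER task already queued and a SOFTRT task available on another CPU, toy_pre_schedule() returns CLASS_SOFTRT, because the walk reaches SOFTRT before it ever looks at OTHER. The sketch ignores the locking entirely, which is the hard part in the real code.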

I think find_busiest_group() et al. are really SCHED_OTHER specific, and probably always will be.
Let's just formalize that. Perhaps we should move all the LB code to sched_fair and set
up something like what I proposed. Thoughts?

-Greg




