Re: [RFC PATCH v3 00/16] Core scheduling v3

From: Aaron Lu
Date: Fri May 31 2019 - 02:13:10 EST


On 2019/5/31 13:12, Aubrey Li wrote:
> On Fri, May 31, 2019 at 11:01 AM Aaron Lu <aaron.lu@xxxxxxxxxxxxxxxxx> wrote:
>>
>> This feels like "date" failed to schedule on some CPU
>> on time.
>>
>> My first reaction is: when shell wakes up from sleep, it will
>> fork date. If the script is untagged and those workloads are
>> tagged and all available cores are already running workload
>> threads, the forked date can lose to the running workload
>> threads due to __prio_less() can't properly do vruntime comparison
>> for tasks on different CPUs. So those idle siblings can't run
>> date and are idled instead. See my previous post on this:
>> https://lore.kernel.org/lkml/20190429033620.GA128241@aaronlu/
>> (Now that I re-read my post, I see that I didn't make it clear
>> that se_bash and se_hog are assigned different tags(e.g. hog is
>> tagged and bash is untagged).
>
> Yes, script is untagged. This looks like exactly the problem in you
> previous post. I didn't follow that, does that discussion lead to a solution?

No immediate solution yet.

>>
>> Siblings being forced idle is expected due to the nature of core
>> scheduling, but when two tasks belonging to two siblings are
>> fighting for schedule, we should let the higher priority one win.
>>
>> It used to work on v2 is probably due to we mistakenly
>> allow different tagged tasks to schedule on the same core at
>> the same time, but that is fixed in v3.
>
> I have 64 threads running on a 104-CPU server, that is, when the

104-CPU means 52 cores I guess.
64 threads may(should?) spread on all the 52 cores and that is enough
to make 'date' suffer.

> system has ~40% idle time, and "date" is still failed to be picked
> up onto CPU on time. This may be the nature of core scheduling,
> but it seems to be far from fairness.

Exactly.

> Shouldn't we share the core between (sysbench+gemmbench)
> and (date)? I mean core level sharing instead of "date" starvation?

We need to make core scheduling fair, but due to no
immediate solution to vruntime comparison cross CPUs, it's not
done yet.