[PATCH updated v2] sched/fair: core wide cfs task priority comparison

From: Aaron Lu
Date: Fri Apr 24 2020 - 10:24:56 EST


On Tue, Apr 21, 2020 at 10:51:31AM +0800, Aaron Lu wrote:
> On Mon, Apr 20, 2020 at 06:26:34PM -0400, Vineeth Remanan Pillai wrote:
> > On Mon, Apr 20, 2020 at 4:08 AM Aaron Lu <aaron.lwe@xxxxxxxxx> wrote:
> > >
> > > On Fri, Apr 17, 2020 at 05:40:45PM +0800, Aaron Lu wrote:
> >
> > > The adjust is only needed when core scheduling is enabled while I
> > > mistakenly called it on both enable and disable. And I come to think
> > > normalize is a better name than adjust.
> > >
> > I guess we would also need to update the min_vruntime of the sibling
> > to match the rq->core->min_vruntime on coresched disable. Otherwise
> > a new enqueue on root cfs of the sibling would inherit the very old
> > min_vruntime before coresched enable and thus would starve all the
> > already queued tasks until the newly enqueued se's vruntime catches up.
>
> Yes this is a concern but AFAICS, there is no problem. Consider:
> - when there is no queued task across the disable boundary, the stale
> min_vruntime doesn't matter as you said;
> - when there are queued tasks across the disable boundary, the newly
> queued task will normalize its vruntime against the sibling_cfs_rq's
> min_vruntime, if the min_vruntime is stale and problem would occur.
> But my reading of the code made me think this min_vruntime should
> have already been updated by update_curr() in enqueue_entity() before
> being used by this newly enqueued task and update_curr() would bring
> the stale min_vruntime to the smallest vruntime of the queued ones so
> again, no problem should occur.

After discussion with Vineeth, I now tend to add the syncing of
sibling_cfs_rq min_vruntime on core disable because analysing all the
code is time consuming and though I didn't find any problems now, I
might miss something and future code change may also break the
expectations so adding it seems a safe thing to do, also, it didn't
bring any performance downgrade as it is a one time disable stuff.

Vineeth also pointed out a problem of misusing cfs_rq->min_vruntime for
!CONFIG_64BIT kernel in migrate_task_rq_fair(), this is also fixed.

(only compile tested for !CONFIG_64BIT kernel)