Re: [PATCH v8 1/9] sched/fair: fix unfairness at wakeup

From: Vincent Guittot
Date: Thu Nov 17 2022 - 04:18:58 EST


On Wed, 16 Nov 2022 at 09:26, Aaron Lu <aaron.lu@xxxxxxxxx> wrote:
>
> On Mon, Nov 14, 2022 at 12:05:18PM +0100, Vincent Guittot wrote:
> > On Mon, 14 Nov 2022 at 04:06, Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
> > >
> > > Hi Vincent,
> > >
> > > On Thu, Nov 10, 2022 at 06:50:01PM +0100, Vincent Guittot wrote:
>
> ... ...
>
> > > > +static inline unsigned long get_latency_max(void)
> > > > +{
> > > > + unsigned long thresh = get_sched_latency(false);
> > > > +
> > > > + thresh -= sysctl_sched_min_granularity;
> > >
> > > Could you clarify, why are you subtracting sched_min_granularity here? Could
> > > you add some comments here to make it clear?
> >
> > If the waking task failed to preempt current, it could have to wait up to
> > sysctl_sched_min_granularity before preempting it during the next tick.
>
> check_preempt_tick() compares vdiff/delta between the leftmost se and
> curr against curr's ideal_runtime, it doesn't use thresh here or the
> adjusted wakeup_gran, so I don't see why reducing thresh here can help
> se to preempt curr during next tick if it failed to preempt curr in its
> wakeup path.

If the waking task doesn't preempt curr, it has to wait for the next
check_preempt_tick(), but check_preempt_tick() ensures that curr has run
for at least sysctl_sched_min_granularity before it compares the vruntimes.
Thresh doesn't help in check_preempt_tick() itself, but it anticipates the
fact that if the waking task fails to preempt now, current can get an
additional sysctl_sched_min_granularity of runtime before being preempted.
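
As a rough sketch of that reasoning, here is a simplified user-space model.
It is not the kernel code: the model_* names and the nanosecond values are
made up for illustration, and the tick check is reduced to the few
comparisons discussed in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values in nanoseconds (assumptions, not read from a kernel). */
#define MODEL_LATENCY_NS   6000000ULL
#define MODEL_MIN_GRAN_NS   750000ULL

/*
 * Mirrors the arithmetic of the quoted get_latency_max(): the threshold is
 * the latency minus the minimum granularity, so that the extra runtime curr
 * may get before the next tick-time check is already accounted for.
 */
static uint64_t model_latency_max(uint64_t latency_ns, uint64_t min_gran_ns)
{
	return latency_ns - min_gran_ns;
}

/*
 * Simplified model of the tick-time decision.
 * delta_exec_ns:    how long curr has run since it was last picked
 * vdiff_ns:         curr's vruntime minus the leftmost entity's vruntime
 * ideal_runtime_ns: curr's slice
 * Returns true when curr would be rescheduled at this tick.
 */
static bool model_tick_preempts(uint64_t delta_exec_ns, int64_t vdiff_ns,
				uint64_t ideal_runtime_ns,
				uint64_t min_gran_ns)
{
	/* curr has used up its slice: always reschedule */
	if (delta_exec_ns > ideal_runtime_ns)
		return true;

	/* curr is guaranteed a minimum runtime before vruntimes are compared */
	if (delta_exec_ns < min_gran_ns)
		return false;

	/* the leftmost entity is not behind curr: nothing to do */
	if (vdiff_ns < 0)
		return false;

	return (uint64_t)vdiff_ns > ideal_runtime_ns;
}
```

With these numbers, model_latency_max(MODEL_LATENCY_NS, MODEL_MIN_GRAN_NS)
gives 5250000, and a waking task that is well ahead in vruntime still
cannot preempt at a tick where curr has run for less than MODEL_MIN_GRAN_NS.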

>
> I can see reducing thresh here with whatever value can help the waking
> se to preempt curr in wakeup_preempt_entity() though, because most
> likely the waking se's vruntime is cfs_rq->min_vruntime -
> sysctl_sched_latency/2 and curr->vruntime is near cfs_rq->min_vruntime
> so vdiff is about sysctl_sched_latency/2, which is the same value as
> get_sched_latency(false) and when thresh is reduced some bit, then vdiff
> in wakeup_preempt_entity() will be larger than gran and make it possible
> to preempt.
>
> So I'm confused by your comment or I might misread the code.