[tip: sched/urgent] sched/eevdf: Fix avg_vruntime()

From: tip-bot2 for Peter Zijlstra
Date: Tue Oct 03 2023 - 06:42:45 EST


The following commit has been merged into the sched/urgent branch of tip:

Commit-ID: 650cad561cce04b62a8c8e0446b685ef171bc3bb
Gitweb: https://git.kernel.org/tip/650cad561cce04b62a8c8e0446b685ef171bc3bb
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
AuthorDate: Tue, 26 Sep 2023 14:29:50 +02:00
Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CommitterDate: Tue, 03 Oct 2023 12:32:29 +02:00

sched/eevdf: Fix avg_vruntime()

The expectation is that placing a task at avg_vruntime() makes it
eligible. Turns out there is a corner case where this is not the case.

Specifically, avg_vruntime() relies on the fact that integer division
is a flooring function (eg. it discards the remainder). By this
property the value returned is slightly left of the true average.

However! when the average is a negative (relative to min_vruntime) the
effect is flipped and it becomes a ceil, with the result that the
returned value is just right of the average and thus not eligible.

Fixes: af4cf40470c2 ("sched/fair: Add cfs_rq::avg_vruntime")
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
kernel/sched/fair.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7d73652..ef7490c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -664,6 +664,10 @@ void avg_vruntime_update(struct cfs_rq *cfs_rq, s64 delta)
cfs_rq->avg_vruntime -= cfs_rq->avg_load * delta;
}

+/*
+ * Specifically: avg_runtime() + 0 must result in entity_eligible() := true
+ * For this to be so, the result of this function must have a left bias.
+ */
u64 avg_vruntime(struct cfs_rq *cfs_rq)
{
struct sched_entity *curr = cfs_rq->curr;
@@ -677,8 +681,12 @@ u64 avg_vruntime(struct cfs_rq *cfs_rq)
load += weight;
}

- if (load)
+ if (load) {
+ /* sign flips effective floor / ceil */
+ if (avg < 0)
+ avg -= (load - 1);
avg = div_s64(avg, load);
+ }

return cfs_rq->min_vruntime + avg;
}