Re: [PATCH 1/4] sched/eevdf: Fix vruntime adjustment on reweight

From: Tianchen Ding
Date: Fri Mar 01 2024 - 05:08:06 EST


On 2024/3/1 16:30, Abel Wu wrote:
On 3/1/24 2:41 PM, Tianchen Ding wrote:
On 2024/2/29 22:25, Abel Wu wrote:
Good catch. And to the best of my knowledge, the answer is YES. The
above Equation in the paper, which is Eq. (20), is based on the
assumption that:

     "once client 3 leaves, the remaining two clients will
      proportionally support the eventual loss or gain in the
      service time"  -- Page 10

     "by updating the virtual time according to Eq. (18,19) we
      ensure that the sum over the lags of all active clients
      is always zero"  -- Page 11

But in Peter's implementation, it is the competitors in the new group
that client 3 later joins who actually absorb the effect. So when
client 3 leaves the competition with non-zero lag in Linux, the rq's
sum(lag_i) is no longer zero.


I have a different opinion. According to the comment above avg_vruntime_add(), V
is calculated exactly so that sum(lag_i)=0. This is guaranteed by the math.
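
A minimal numeric sketch of the claim above, assuming the definitions
lag_i = w_i * (V - v_i) and V = sum(w_i * v_i) / sum(w_i). The function
names and the sample (weight, vruntime) pairs are made up for
illustration; this is not kernel code.

```python
from fractions import Fraction

def avg_vruntime(entities):
    """Weighted average vruntime V = sum(w_i * v_i) / sum(w_i)."""
    total_weight = sum(w for w, _ in entities)
    return Fraction(sum(w * v for w, v in entities), total_weight)

def lags(entities):
    """Real lag of every entity: lag_i = w_i * (V - v_i)."""
    V = avg_vruntime(entities)
    return [w * (V - v) for w, v in entities]

# Hypothetical runqueue: (weight, vruntime) pairs.
entities = [(1, 10), (2, 13), (3, 7)]
print("V =", avg_vruntime(entities))
print("sum(lag_i) =", sum(lags(entities)))
```

Exact fractions are used so that the zero sum is not blurred by
rounding. The sum stays zero for any choice of weights and vruntimes,
because V is defined as the weighted average.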

Yes, you are right. I conflated this with another fairness issue. What
I was thinking is that with multiple competition groups (e.g.
runqueues), the latency bound could be violated, that is, someone could
starve a bit. Say, an entity even with positive lag could become less
competitive if migrated to a more competitive group.

Staring at Eq. (20) again, what if we do a fake reweight? I mean, let
the client leave and rejoin at the same time without changing its
weight? IMHO it should have no effect, but according to Eq. (20) V
will change to:

    V' = V + lag(j)/(W - w_j) - lag(j)/W != V

Have I missed anything?
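
The fake reweight above can be checked with concrete numbers. This is
only a sketch of the expression as quoted in this thread; the function
name and the sample values are hypothetical.

```python
from fractions import Fraction

def eq20_fake_reweight(V, W, w_j, lag_j):
    """V' = V + lag(j)/(W - w_j) - lag(j)/W, as quoted in the thread.

    W is the total weight including client j; W and w_j are integers,
    and w_j is left unchanged (the "fake" reweight).
    """
    V_after_leave = V + Fraction(lag_j, W - w_j)   # client j leaves
    return V_after_leave - Fraction(lag_j, W)      # client j rejoins

V = Fraction(19, 2)
V_new = eq20_fake_reweight(V, W=6, w_j=2, lag_j=3)
print("V' - V =", V_new - V)
```

With a non-zero lag the two correction terms have different
denominators, so they cannot cancel; only lag(j) = 0 gives V' = V.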


Good point! I had never noticed this conflict.

I tried modifying reweight_entity() to run dequeue_entity() -> adjust se->vlag ->
enqueue_entity(), and I found that V does not change.

The difference is that, when doing enqueue_entity(), Peter enlarges the lag in
place_entity(), because otherwise the lag would evaporate after the enqueue.
In order to keep the same lag after the enqueue, during place_entity()
the new lag(t) is enlarged by a factor of (W+w_i)/W.

So Eq. (20) should be:


V' = V + lag(j)/(W - w_j) - lag'(j)/(W - w_j + w'_j)

lag'(j) = lag(j) * (W - w_j + w'_j)/(W - w_j)

So we get

V' = V + lag(j)/(W - w_j) - lag(j) * (W - w_j + w'_j)/(W - w_j)/(W - w_j + w'_j) = V

So COROLLARY #2 is correct.
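
The dequeue -> adjust vlag -> enqueue sequence described above can be
replayed on a toy model. This is a sketch under stated assumptions, not
the kernel implementation: V is the weighted average vruntime, the vlag
adjustment is assumed to keep the real lag w * vlag unchanged, and the
enqueue step inflates the lag by (W - w_j + w'_j)/(W - w_j) as
described for place_entity(). All names here are hypothetical, and at
least one other entity must remain on the queue.

```python
from fractions import Fraction

def avg_vruntime(entities):
    """Weighted average vruntime V = sum(w_i * v_i) / sum(w_i)."""
    total_weight = sum(w for w, _ in entities)
    return Fraction(sum(w * v for w, v in entities), total_weight)

def reweight_via_requeue(entities, j, new_weight):
    """Return (V_before, V_after) for dequeue -> rescale vlag -> enqueue."""
    w_j, v_j = entities[j]
    V_before = avg_vruntime(entities)
    vlag = V_before - v_j                  # virtual lag before the dequeue

    rest = entities[:j] + entities[j + 1:]
    load = sum(w for w, _ in rest)         # W - w_j
    V_left = avg_vruntime(rest)            # V once entity j has left

    # Assumed adjustment: keep the real lag w * vlag constant.
    vlag = vlag * w_j / new_weight
    # Inflate the lag so it survives the enqueue: (load + w') / load.
    vlag = vlag * (load + new_weight) / load

    new_vruntime = V_left - vlag
    V_after = avg_vruntime(rest + [(new_weight, new_vruntime)])
    return V_before, V_after

# Hypothetical runqueue: (weight, vruntime) pairs; reweight entity 1 from 2 to 5.
before, after = reweight_via_requeue([(1, 10), (2, 13), (3, 7)], j=1, new_weight=5)
print(before, after)
```

In this model V comes out the same before and after, including for the
fake reweight where new_weight equals the old weight, which matches the
cancellation in the corrected equation.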