On 2024/2/29 22:25, Abel Wu wrote:
Good catch. And to the best of my knowledge, the answer is YES. The
above Equation in the paper, which is Eq. (20), is based on the
assumption that:
"once client 3 leaves, the remaining two clients will
proportionally support the eventual loss or gain in the
service time" -- Page 10
"by updating the virtual time according to Eq. (18,19) we
ensure that the sum over the lags of all active clients
is always zero" -- Page 11
But in Peter's implementation, it is the competitors in the new group
that client 3 later joins in who actually support the effect. So when
client 3 leaves competition with !0-lag in Linux, the rq's sum(lag_i)
is no longer zero.
I've different opinions. According to the comments above avg_vruntime_add(), V
is calculated exactly to satisfy sum(lag_i)=0. This is guaranteed by math.
Actually I print some logs in enqueue_entity() and dequeue_entity() to verify this:
[ 293.261236] before dequeue: V=2525278131 W=3072 v=2526243139 w=1024 lag_sum=0
[ 293.261237] after dequeue: V=2524795627 W=2048 v=2526243139 w=1024 lag_sum=0
[ 293.262286] before enqueue: V=2525319064 W=2048 v=2526766576 w=1024 lag_sum=0
[ 293.262287] after enqueue: V=2525801568 W=3072 v=2526766576 w=1024 lag_sum=0
For the first 2 lines, we have 2524795627 = 2525278131 + (2525278131 - 2526243139) * 1024 / 2048.
Which is Eq. (18)
For the last 2 lines, we have 2525801568 = 2525319064 - (2525319064 - 2526766576) * 1024 / 3072.
Which is Eq. (19)
So whatever client 3 leave or join competition with !0-lag in Linux, V is handled properly.
Good catch again! It smells like a bug. Since this @se is still on_rq,
it should be taken into consideration when calculating avg_runtime(),
but in fact it isn't because __dequeue_entity() will remove its share.
And I seem to spot another bug, although not relate to this problem,
that we actually need to call update_curr() unconditionally if curr is
available, because we need to commit curr's outstanding runtime to
ensure the result of avg_runtime() is up to date.
I've tried to record avg_vruntime before __dequeue_entity() and pass it to
reweight_eevdf(). Then the issue is fixed. The V keeps the same during the whole
reweight_entity().
I could send these two bugfix patches (one for this bug and one you sugguested