Re: Sum of weights idea for CFS PI

From: Qais Yousef
Date: Wed Oct 05 2022 - 06:04:13 EST


Hi Joel

On 10/04/22 16:27, Joel Fernandes wrote:

[...]

> I am treating the following the same:
>
> a. when A is running, it would be as above.
> b. but if A was sleeping, B, C, and D would get 1/3.
>
> similar to
>
> a. when A is running *and blocked on C for all its runtime*
> ^^ -- in this case, B and D should not have their distributions
> changed at all because they are not participating in the
> lock acquire and release. So they should neither be hurt
> any more, nor be boosted. They should simply stay same [1]
>
> b. but if A was sleeping, B, C, and D would get 1/3.
>
>
> [1] Why? Consider 3 tasks in the all-RT case, A high, B medium and C low prio.
>
> If all are running 100% and A does not block on C, B is blocked by A
> indefinitely. So the prio of A and B are inverted. We seek to rectify this, that
> is we need make changes such that, B is returned back to the blocked state. We
> do this by boosting C.
>
> In other words, the prio inheritance will cause B's distribution to not be
> changed (it was supposed to be blocked before and it is now going to be blocked
> state again).
>
> CFS should not behave any differently, B's distribution should not be changed
> before/after the priority inhertiance of A by C. That's just my opinion - and
> that's how I calculated to distribution. With that mind, could you go back to
> seeing if my math was originally correct or did I mess something up?

It's not about the math. But I think the before and after can't be the same for
C..

> I do think though that Youssef's point of not worrying about being too accurate
> is reasonable if the critical sections are short lived but I'm not sure.

.. I do agree with that as well. I was just trying to highlight that looking at
average can be misleading and I don't see C taking too much time.

If any worries I have, it'd be not accounting correctly for the stolen time
C takes from A. Otherwise A + C share combined would be higher than it should
be. Which might be the problem you're trying to highlight but I am unable to
get/see. But this is an implementation detail and an artefact of wrong
accounting, not how shares are summed.

> > I don't think this is valid. If A is blocked on C for 50% of the time, and
> > sleeping for 50% of the time, when did it get blocked/unblocked?
> >
> > This will have an impact on the average share for C and skew it, no?
> >
> > Unless I missed something, the average share of C being (3/5 + 1/3) is an
> > impossible state. You need to consider the portion of time when C runs as 1/5,
> > when A is actually not blocked on anything, too.
> >
> > Hmm actually I just re-read your statement below and you just say 3/5 (18/30)
> > is too much. You didn't consider the average. I'll leave the above in hope to
> > help me understand what am I missing and where I went wrong :-)
> >
> > Generally IMHO looking at the average will not help. I think if the share
> > values make sense in each state individually (and I believe they are), that
> > would be enough. AFAICS, B and D are still taking the right amount of time when
> > C inherits the bandwidth. And C by definition will run longer when A is blocked
> > on it for the whole duration of this blocked time.
>
> I was degenerating the case where A sleeps (say I/O) vs A blocks, to simplify
> the math, and then taking average of that. I think that's reasonable?

I'm not sure. This is skewing the results in my view.

I think the comparison should just be:

1) A, B, C, and D are all running and nothing gets blocked at all. Then shares
would be:

2/5, 1/5, 1/5, 1/5

2) A is blocked and C; B, C, D are running with no blocked time. Shares would
be:

- , 1/5, 3/5, 1/5

By definition, we want to treat A in (2) as RUNNING because as soon as
C unblocks A we should return to (1). From B and D perspective, their share is
not impacted throughout this transition. Which is AFAIU is what we want to
achieve.

I think considering the sleeping time and averaging can lead to misleading
results if care is not taken.

Anyway - just trying to explain how I see it and why C is unlikely to be taking
too much time. I could be wrong. As Youssef said, I think there's no
fundamental problem here.

My 2 cents ;-)


Thanks!

--
Qais Yousef