Re: [RFC PATCH 00/11] Reviving the Proxy Execution Series

From: Juri Lelli
Date: Wed Oct 19 2022 - 10:02:44 EST


On 19/10/22 08:23, Joel Fernandes wrote:
>
>
> > On Oct 19, 2022, at 7:43 AM, Qais Yousef <qyousef@xxxxxxxxxxx> wrote:
> >
> > On 10/17/22 02:23, Joel Fernandes wrote:
> >
> >> I ran a test to check CFS time sharing. The accounting on top is confusing,
> >> but ftrace confirms the proxying happening.
> >>
> >> Task A - pid 122
> >> Task B - pid 123
> >> Task C - pid 121
> >> Task D - pid 124
> >>
> > >> Here D and B just spin all the time. C is the lock owner (in-kernel mutex)
> > >> and spins all the time, while A blocks on the same in-kernel mutex and
> > >> remains blocked.
> >>
> > >> Then I ran "top -H" while the test was running, which gives the output
> > >> below. The first column is PID, and the column right after the task state
> > >> (the fourth-last one) is CPU percentage.
> >>
> >> Without PE:
> >> 121 root 20 0 99496 4 0 R 33.6 0.0 0:02.76 t (task C)
> >> 123 root 20 0 99496 4 0 R 33.2 0.0 0:02.75 t (task B)
> >> 124 root 20 0 99496 4 0 R 33.2 0.0 0:02.75 t (task D)
> >>
> >> With PE:
> >> PID
> >> 122 root 20 0 99496 4 0 D 25.3 0.0 0:22.21 t (task A)
> >> 121 root 20 0 99496 4 0 R 25.0 0.0 0:22.20 t (task C)
> >> 123 root 20 0 99496 4 0 R 25.0 0.0 0:22.20 t (task B)
> >> 124 root 20 0 99496 4 0 R 25.0 0.0 0:22.20 t (task D)
> >>
> >> With PE, I was expecting 2 threads with 25% and 1 thread with 50%. Instead I
> >> get 4 threads with 25% in the top. Ftrace confirms that the D-state task is
> >> in fact not running and proxying to the owner task so everything seems
> > >> working correctly, but the accounting seems confusing, as in, it is confusing
> > >> to see the D-state task taking 25% CPU when it is obviously "sleeping".
> >>
> > >> Yeah, yeah, I know the D-state task (A) is proxying for C (while being in
> > >> the uninterruptible sleep state), so maybe it is OK then, but I did want
> > >> to bring this up :-)
> >
> > I seem to remember Valentin raised a similar issue about how the userspace
> > view can get confusing/misleading:
> >
> > https://www.youtube.com/watch?v=UQNOT20aCEg&t=3h21m41s
>
> Thanks for the pointer! Glad to see the consensus was that this is not
> acceptable.
>
> I think we ought to write a patch to fix the accounting, for this
> series. I propose adding 2 new entries to /proc/pid/stat, which I think
> Juri was also sort of alluding to:
>
> 1. Donated time.
> 2. Proxied time.

Sounds like a useful addition, at least from a debugging point of view.

> User space can then add or subtract these to calculate things
> correctly, or just display them in new columns. I think it will also
> show how much proxying actually happens for a given use case.

I guess we'll still need to stay backward compatible with old userspace,
though? Probably by reporting the owner as running while it is proxied (as
in the comparison with rtmutexes that Valentin showed).
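
To make the "add or subtract" idea concrete, here is a rough userspace
sketch in Python. It is only an illustration under assumptions: the
donated/proxied fields do not exist today, the names TaskSample,
donated_pct, proxied_pct and executed_pct are hypothetical, and the
semantics (donated = share charged to a blocked task whose context ran
the lock owner, proxied = share the owner executed on donated context)
are one possible reading of the proposal. The reported numbers are the
"With PE" values from the top output quoted above; the donated and
proxied values are assumed.

```python
from dataclasses import dataclass


@dataclass
class TaskSample:
    name: str
    reported_pct: float        # %CPU as "top -H" shows it with PE
    donated_pct: float = 0.0   # hypothetical "donated time" entry
    proxied_pct: float = 0.0   # hypothetical "proxied time" entry


def executed_pct(task: TaskSample) -> float:
    """CPU share the task itself executed (assumed semantics)."""
    return task.reported_pct - task.donated_pct + task.proxied_pct


# Assumption: A donates everything it is charged for, and C receives it.
samples = [
    TaskSample("A", 25.3, donated_pct=25.3),
    TaskSample("C", 25.0, proxied_pct=25.3),
    TaskSample("B", 25.0),
    TaskSample("D", 25.0),
]

for task in samples:
    print(f"task {task.name}: reported {task.reported_pct:5.1f}%  "
          f"executed {executed_pct(task):5.1f}%")
```

Under these assumptions the adjusted view shows A at roughly 0% and C at
roughly 50%, which is the "owner runs while proxied" picture, while the
totals stay the same because the donated share is only moved from one
task to another.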