Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response

From: Marcelo Tosatti
Date: Wed Jan 04 2006 - 07:01:15 EST


Hi Peter,

On Wed, Jan 04, 2006 at 12:25:40PM +1100, Peter Williams wrote:
> Peter Williams wrote:
> >Helge Hafting wrote:
> >
> >>On Wed, Dec 21, 2005 at 05:32:52PM +1100, Peter Williams wrote:
> >>
> >>>Trond Myklebust wrote:
> >>
> >>
> >>[...]
> >>
> >>>>Sorry. That theory is just plain wrong. ALL of those cases _ARE_
> >>>>interactive sleeps.
> >>>
> >>>
> >>>It's not a theory. It's a result of using top to observe a -j 16
> >>>build with the sources on an NFS mounted file system, with and
> >>>without the patches, and comparing that with the same builds with
> >>>the sources on a local file system. Without the patches the tasks
> >>>in the kernel build
> >>>all get the same dynamic priority as the X server and other
> >>>interactive programs when the sources are on an NFS mounted file
> >>>system. With the patches they generally have dynamic priorities
> >>>between 6 and 10 higher than the X server and other interactive programs.
> >>>
> >>
> >>A process waiting for NFS data loses cpu time, which is spent on
> >>running something else. Therefore, it gains some priority so it won't be
> >>forever behind when it wakes up. Same as for any other io waiting.
> >
> >
> >That's more or less independent of this issue as the distribution of CPU
> >to tasks is largely determined by the time slice mechanism and the
> >dynamic priority is primarily about latency. (This distinction is a
> >little distorted by the fact that, under some circumstances,
> >"interactive" tasks don't get moved to the expired list at the end of
> >their time slice but this usually won't matter as genuine interactive
> >tasks aren't generally CPU hogs.) In other words, the issue that you
> >raised is largely solved by the time tasks spend on the active queue
> >before moving to the expired queue rather than the order in which they
> >run when on the active queue.
> >
> >This problem is all about those tasks getting an inappropriate boost to
> >improve their latency because they are mistakenly believed to be
> >interactive.
>
> One of the unfortunate side effects of this is that it can affect
> scheduler fairness because if these tasks get sufficient bonus points
> the TASK_INTERACTIVE() macro will return true for them and they will be
> rescheduled on the active queue instead of the expired queue at the end
> of the time slice (provided EXPIRED_STARVING() doesn't prevent this).
> This will have an adverse effect on scheduling fairness.
>
> The ideal design of the scheduler would be for the fairness mechanism
> and the interactive responsiveness mechanism to be independent but this
> is not the case due to the fact that requeueing interactive tasks on the
> expired array could add unacceptably to their latency. As I said above
> this slight divergence from the ideal of perfect independence shouldn't
> matter as genuine interactive processes aren't very CPU intensive.
>
> In summary, inappropriate identification of CPU intensive tasks as
> interactive has two bad effects: 1) responsiveness problems for genuine
> interactive tasks due to the extra competition at their dynamic priority
> and 2) a degradation of scheduling fairness; not just one.
>
> For an example of the effect of inappropriate identification of CPU hogs
> as interactive tasks see the thread "[SCHED] Totally WRONG priority
> calculation with specific test-case (since 2.6.10-bk12)" in this list.
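
For anyone following along, the requeue decision described above can be
sketched roughly as below. This is a toy model, not the code in the
kernel: the toy_* names, the struct layout and the threshold value
TOY_INTERACTIVE_DELTA are all assumptions made up for illustration. Only
the shape of the test (interactive and expired array not starving means
back onto the active queue) is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, not a kernel symbol. */
enum toy_queue { TOY_ACTIVE_QUEUE, TOY_EXPIRED_QUEUE };

struct toy_task {
	int static_prio;	/* priority from nice; lower value runs first */
	int prio;		/* dynamic priority after sleep bonus/penalty */
};

/* Assumed threshold: how far below static_prio counts as "interactive". */
#define TOY_INTERACTIVE_DELTA 2

/* Stand-in for the TASK_INTERACTIVE() test discussed above. */
static bool toy_task_interactive(const struct toy_task *p)
{
	return p->prio <= p->static_prio - TOY_INTERACTIVE_DELTA;
}

/*
 * Where does a task go when its time slice runs out?
 * expired_starving stands in for the EXPIRED_STARVING() check.
 */
static enum toy_queue toy_requeue(const struct toy_task *p,
				  bool expired_starving)
{
	assert(p != NULL);

	if (!toy_task_interactive(p) || expired_starving)
		return TOY_EXPIRED_QUEUE;	/* normal fairness path */

	return TOY_ACTIVE_QUEUE;		/* keeps competing right away */
}
```

With these made-up numbers, a CPU hog at static_prio 120 that has been
boosted to prio 116 by its NFS sleeps goes straight back onto the active
queue, while the same task at prio 124 goes to the expired queue. That
is the fairness effect being discussed.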

And another real-life example of the issue you describe above.