Re: [PATCH] CFS: Fix missing digit off in wmult table

From: Ingo Molnar
Date: Mon Jul 16 2007 - 17:19:32 EST



* Matt Mackall <mpm@xxxxxxxxxxx> wrote:

> More dynamic range is better? If you actually want a task to get 20x
> the CPU time of another, the older scheduler doesn't really allow it.
>
> Getting 1/69th of a modern CPU is still a fair number of cycles.
> Nevermind 1/69th of a machine with > 64 cores.

yeah. furthermore, nice -20 is only admin-selectable.

Here are the current CPU-use values for positive nice levels:

nice 0: 100.00%
nice 1: 80.00%
nice 2: 64.10%
nice 3: 51.28%
nice 4: 40.98%
nice 5: 32.78%
nice 6: 26.24%
nice 7: 21.00%
nice 8: 16.77%
nice 9: 13.42%
nice 10: 10.74%
nice 11: 8.59%
nice 12: 6.87%
nice 13: 5.50%
nice 14: 4.39%
nice 15: 3.51%
nice 16: 2.81%
nice 17: 2.25%
nice 18: 1.80%
nice 19: 1.44%

here's the CPU utilization table for negative nice levels (relative to a
nice -20 task):

nice 0: 1.15%
nice -1: 1.44%
nice -2: 1.80%
nice -3: 2.25%
nice -4: 2.81%
nice -5: 3.51%
nice -6: 4.39%
nice -7: 5.50%
nice -8: 6.87%
nice -9: 8.59%
nice -10: 10.74%
nice -11: 13.42%
nice -12: 16.77%
nice -13: 21.00%
nice -14: 26.24%
nice -15: 32.78%
nice -16: 40.98%
nice -17: 51.28%
nice -18: 64.10%
nice -19: 80.00%
nice -20: 100.00%

these are pretty sane, and symmetric across the origo. Nice -20 is the
odd one out, because there is no nice +20. But its value is still
logical, it's the mirror image of an imaginery nice +20.

and note that even on the old scheduler, nice-0 was "3200% more
powerful" than nice +19 (with CONFIG_HZ=300), and nice -19 was only 700%
more powerful than nice-0. So not only was it inconsistent (and i can
create scary numbers too ;), it gave the admin-controlled negative nice
levels less of a punch than to user-controlled nice +19. A number of
people complainted about that, and CFS addresses this.

in fact i like it that nice -20 has a slightly bigger punch than it used
to have before: it might remove the need to run audio apps (and other
multimedia apps) under SCHED_FIFO. (SCHED_FIFO is unprotected against
lockups, while under CFS a nice 0 task is still starvation protected
against a nice -20 task.)

furthermore, there is a quality of implementation issue as well, look at
the definition of the nice system call:

asmlinkage long sys_nice(int increment)

the "increment" is relative. So nice(1) has the same behavioral effect
under CFS, regardless of which nice level you start out from. Under the
old scheduler, the result depended on which nice level you started out
from.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/