Re: Schedulers benchmark - Was: [ANNOUNCE][RFC] PlugSched-5.2.4 for2.6.12 and 2.6.13-rc6

From: Peter Williams
Date: Wed Aug 17 2005 - 18:17:05 EST

Next message: Lee Revell: "Re: [rfc][patch] API for timer hooks"
Previous message: Davda, Bhavesh P (Bhavesh): "Debugging kernel semaphore contention and priority inversion"
In reply to: Con Kolivas: "Re: Schedulers benchmark - Was: [ANNOUNCE][RFC] PlugSched-5.2.4 for 2.6.12 and 2.6.13-rc6"
Next in thread: Con Kolivas: "Re: Schedulers benchmark - Was: [ANNOUNCE][RFC] PlugSched-5.2.4 for 2.6.12 and 2.6.13-rc6"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Con Kolivas wrote:

On Wed, 17 Aug 2005 18:10, Peter Williams wrote:

Michal Piotrowski wrote:

Hi,
here are schedulers benchmark (part2):
[bits deleted]

Here's a summary of your output generated using the attached Python script.

| Build Statistics | Overall Statistics

-----------------------------------------------------------------------
Scheduler| Real CPU SYS TPT | CPU TPT delay CXSW

| (secs) (secs) (%) (%) | (secs) (%) (secs)

-----------------------------------------------------------------------
ingosched| 3128.5 5056.3 8.18 161.6 | 5379.5 171.9 159367.4 1556452
staircase| 3131.2 5032.6 8.09 160.7 | 5352.9 170.9 135193.0 1670366
spa_no_frills| 3103.8 5049.5 7.98 162.7 | 5266.7 169.7 172384.8 520937
zaphod(d,d)| 3561.7 4823.8 9.25 135.4 | 5132.0 144.1 148361.5 1771617
zaphod(d,0)| 3551.2 4809.9 9.19 135.4 | 5114.7 144.0 144022.0 1784814
zaphod(0,d)| 3126.8 5063.2 8.11 161.9 | 5278.1 168.8 173438.4 573587
zaphod(0,0)| 3105.5 5052.9 7.98 162.7 | 5254.8 169.2 165774.4 577534
nicksched| 3294.7 5095.1 9.10 154.6 | 5425.4 164.6 104298.2 2205665

where the (x,y) after zaphod means (max_ia_bonus, max_tpt_bonus) and "d"
means default. I had to kill a few significant digits to squeeze it
into 71 columns. Overall statistics are extracted from the schedstats
data. In the "Build Statistics" "CPU" is the sum of the user and sys
times and "SYS" is the percentage of that which was sys time (as I feel
that is a better thing to compare than raw sys times).

I was intrigued by the fact that zaphod(d,d) and zaphod(d,0) take longer
in real time but use less cpu. I was assuming that this meant that some
other job was getting some cpu but the schedstats data doesn't support
that. Also it wouldn't make sense anyway as you'd expect jobs doing the
same amount of work to use roughly the same amount of cpu. My latest
theory is that your machine has hyper threads and this artifact is
caused by the mechanism in the scheduler for handling tasks with
differing priority in sibling hyper thread channels. Does your system
have hyper threads?

That would only do something if there was a difference in 'nice' levels.

Not in zaphod and spa_no_frills. They user dynamic priority. I may rethink this as the argument for using dynamic priority mainly applies to the entitlement based mode of zaphod.

What you're seeing is the fact that balancing is intimately tied in with timeslice size and you have increased idle time.

I partially agree in that reducing the time slice size would reduce the size of the effect but it's not the cause of the effect. The hyper threading code is the cause.

BTW I'm wondering why the TPT column (i.e. cpu time / real elapsed time) is so low for all schedulers. It's been a long time since I ran your "contest" benchmarks for any of these schedulers but I seem to recall that they all did a lot better than this when I extracted the equivalent data from the output. Generally, I think that they all used greater than 95% of the available cpu time which would be the equivalent of TPT values of 190% or more in this case.

Another interesting thing to be noted in these numbers is that the cost of the extra context switches caused by the "improved interactive performance" measures doesn't seem to be very significant.

It also looks as if the overhead for zaphod's IA bonus mechanism needs to be reduced.

Peter
--
Peter Williams pwil3058@xxxxxxxxxxxxxx

"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Lee Revell: "Re: [rfc][patch] API for timer hooks"
Previous message: Davda, Bhavesh P (Bhavesh): "Debugging kernel semaphore contention and priority inversion"
In reply to: Con Kolivas: "Re: Schedulers benchmark - Was: [ANNOUNCE][RFC] PlugSched-5.2.4 for 2.6.12 and 2.6.13-rc6"
Next in thread: Con Kolivas: "Re: Schedulers benchmark - Was: [ANNOUNCE][RFC] PlugSched-5.2.4 for 2.6.12 and 2.6.13-rc6"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]