Re: CPU scheduler weirdness?

From: Marton Balint
Date: Thu Aug 20 2009 - 12:56:16 EST




On Thu, 20 Aug 2009, Ingo Molnar wrote:


* Marton Balint <cus@xxxxxxxxxx> wrote:


On Wed, 19 Aug 2009, Peter Zijlstra wrote:

On Wed, 2009-08-19 at 14:34 +0200, Marton Balint wrote:

On Wed, 19 Aug 2009, Peter Zijlstra wrote:

On Wed, 2009-08-19 at 14:01 +0200, Marton Balint wrote:
On Wed, 19 Aug 2009, Peter Zijlstra wrote:
On Tue, 2009-08-18 at 21:49 +0200, Marton Balint wrote:

In the meantime, I was able to create a tiny C program which always
successfully reproduces the bug. It's basically an endless loop which does
not stop while the process is running on the last CPU core. The program
creates multiple instances of itself, to be able to keep all of the CPU
cores busy. After 1 second, the processes running on other than the last
CPU core die, the processes running on the last CPU core remain stuck
there...

I tested it on my dual core system; if someone could test it on a quad
core and report back, that would probably be useful.

Usage: ./schedtest <number of CPU cores>

And don't forget to kill the stuck processes after using the program! :)

So what's the bug? Sure one task will stay on the cpu, and because there
is no contention it doesn't get migrated, and therefore won't quit,
how's that a problem?

The problem is that more than one process remains on that CPU core, and
none of them gets migrated to other (idle) cores. I tested it with my E8400
processor and a 2.6.31-rc5-git3 kernel.

Only one remains here.. on a c2q running 2.6.31-rc6-tip

Do you have a .config handy?


Yes it's in my original post:

http://marc.info/?l=linux-kernel&m=125012584709800&w=2

Right you are... so I built a kernel with the cgroup scheduler in and
tested it on a dual-core opteron machine, but I can't seem to reproduce
this.

Are you using cgroups in any way, or do you simply have it enabled in
your config?

No, it's just enabled. Actually the kernel is from the
openSUSE build service:

http://download.opensuse.org/repositories/Kernel:/HEAD/openSUSE_11.1/x86_64/

But the problem is present for both the kernel-default
kernel and the kernel-vanilla kernel which does not
contain any suse-specific patches.

This evening I had a bit more time to test, and I've
made a surprising discovery: I can only reproduce the
bug if the kernel module of my TV tuner card is loaded.
I have a Leadtek Winfast 2000 XP Expert TV card, it
uses the cx8800 kernel module. It seems that the
problem is somehow related to the infrared sensor of
the TV card, because I recompiled the module with the
'case CX88_BOARD_WINFAST2000XP_EXPERT:' line removed
from cx88-input.c and I couldn't reproduce the bug with
the new kernel module.

Extremely weird. Are timers somehow busted?

How can I check that?

In the meantime, I updated my original C program and also created a
kernel module (schedtest_mod.c) which causes the same scheduling problems
as the kernel module of my TV card. The kernel module is a skeleton of the
infrared sensor polling code in cx88-input.c. It uses
schedule_delayed_work, and this seems to be what causes the problem.

The C program (schedtest.c) is also updated: it now detects the number of
CPU cores by itself. The command line parameter is now the number of the
CPU core on which the schedtest processes will not quit. Previously this
was always the last core, which is still the default if no parameter is
given.

So to reproduce the bug on a dual core system, compile and insert the
kernel module (schedtest_mod.c). Then check dmesg, which should show the
CPU core the delayed work is running on. Pass the id of the _other_ CPU
core as the command line parameter to the updated schedtest program.

And by the way, thank you guys for the help so far, hopefully we'll get to the bottom of this :)

Regards,
Marton

/* schedtest_mod.c */
#include <linux/module.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/jiffies.h>
#include <linux/smp.h>
#include <linux/workqueue.h>

static int i;
static struct delayed_work d_work;

static void schedtest_work(struct work_struct *work)
{
	/* Re-queue this work to run again in 1 ms. */
	schedule_delayed_work(&d_work, msecs_to_jiffies(1));
	/* Report the current CPU once every 500 runs. */
	if (i++ % 500 == 0) {
		/* get_cpu() disables preemption, put_cpu() re-enables it. */
		printk(KERN_DEBUG "schedtest: I am on CPU %d.\n", get_cpu());
		put_cpu();
	}
}

static int __init schedtest_init_module(void)
{
	INIT_DELAYED_WORK(&d_work, schedtest_work);
	schedule_delayed_work(&d_work, 0);
	return 0;
}

static void __exit schedtest_cleanup_module(void)
{
	/* Waits for a running instance and stops the self-rearming work. */
	cancel_delayed_work_sync(&d_work);
}

module_init(schedtest_init_module);
module_exit(schedtest_cleanup_module);

MODULE_LICENSE("GPL");
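
/* schedtest.c */
#define _GNU_SOURCE
#include <sched.h>	/* sched_getcpu() */
#include <stdlib.h>	/* atoi() */
#include <sys/time.h>
#include <unistd.h>

/* Usage: ./schedtest <cpu core to test> */

/* Millisecond part of the current time (0..999), used to detect ticks. */
int miliseconds(void) {
	struct timeval tv;
	gettimeofday(&tv, 0);
	return tv.tv_usec/1000;
}

int main(int argc, char *argv[]) {
	int lives = 1000, time, lasttime = -1, childs, cores, core_to_test;
	cores = sysconf(_SC_NPROCESSORS_ONLN);
	childs = cores * 2;
	if (argc > 1)
		core_to_test = atoi(argv[1]);
	else
		core_to_test = cores-1;
	/* Fork a chain: each child forks the next one, the parent moves on. */
	while (childs-- && !fork());
	/* Lose one life per millisecond spent off core_to_test, quit at zero. */
	while (lives) {
		time = miliseconds();
		if (lasttime != time && sched_getcpu() != core_to_test)
			lives--;
		lasttime = time;
	}
	return 0;
}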