Re: [announce] [patch] ultra-scalable O(1) SMP and UP scheduler

From: Anton Blanchard (
Date: Fri Jan 04 2002 - 05:30:18 EST

Great stuff Ingo!

> now that new-year's parties are over things are getting boring again. For
> those who want to see and perhaps even try something more complex, i'm
> announcing this patch that is a pretty radical rewrite of the Linux
> scheduler for 2.5.2-pre6:

Good timing :) We were just looking at an application that hit sched_yield
heavily on a large SMP machine (dont have the source so fixing the
symptoms not the cause). Performance was pretty bad with the standard

We managed to get a 4 way ppc64 machine to boot, but unfortunately the
18 way hung somewhere after smp initialisation and before execing
init. More testing to come :)

Is the idle loop optimisation (atomic exchange -1 into need_resched
to avoid IPI) gone forever?

Is my understanding of this right?:



char bitmap[BITMAP_SIZE+1];
list_t queue[MAX_RT_PRIO];

You have an bitmap of size MAX_RT_PRIO+1 (actually I think you have one too
many +1 here :) and you zero the last bit to terminate it. You use
find_first_zero_bit to search the entire priority list and
sched_find_first_zero_bit to search only the first MAX_PRIO (64) bits.

> the SMP load-balancer can be extended/switched with additional parallel
> computing and cache hierarchy concepts: NUMA scheduling, multi-core CPUs
> can be supported easily by changing the load-balancer. Right now it's
> tuned for my SMP systems.

It will be interesting to test this on our HMT hardware.

> - the SMP idle-task startup code was still racy and the new scheduler
> triggered this. So i streamlined the idle-setup code a bit. We do not call
> into schedule() before all processors have started up fully and all idle
> threads are in place.

I like this cleanup, it pushes more stuff out the arch specific code
into init_idle().

> another benchmark measures sched_yield() performance. (which the pthreads
> code relies on pretty heavily.)

Can you send me this benchmark and I'll get some more results :)

I dont think pthreads uses sched_yield on spinlocks any more, but there
seems to be enough userspace code that does.

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at
Please read the FAQ at

This archive was generated by hypermail 2b29 : Mon Jan 07 2002 - 21:00:24 EST