Re: [PATCH] - There appears to be a minor race condition in sched.c

From: Piet Delaney
Date: Thu Mar 26 2009 - 16:46:38 EST


Balbir Singh wrote:
> * Piet Delaney <piet.delaney@xxxxxxxxxxxxx> [2009-03-25 20:46:11]:
>
> > Ingo, Peter:
> >
> > There appears to be a minor race condition in sched.c where
> > you can get a division by zero. I suspect that it only shows
> > up when the kernel is compiled without optimization and the code
> > loads rq->nr_running from memory twice.
> >
> > It's part of our SMP stabilization changes that I just posted to:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/piet/xtensa-2.6.27-smp.git
> >
> > I mentioned it to Johannes the other day and he suggested passing it on to you ASAP.
>
> The latest version uses ACCESS_ONCE to get rq->nr_running and then
> uses that value. I am not sure what version you are talking about, if
> it is older, you should consider backporting from the current version.

Hi Balbir:

It appears that Steven Rostedt changed cpu_avg_load_per_task() to use a local
variable nr_running, just as I suggested, apparently back in 2.6.28-rc5
last November, well after the 2.6.27 that I mentioned above.

A few days later Ingo added the ACCESS_ONCE() after Linus pointed out
that nothing prevented the compiler from reloading rq->nr_running.
Linus was right: the volatile access is necessary to prevent gcc
from doing forward substitution, i.e. replacing the local variable
with a second load of rq->nr_running.
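
For the archive, here is a rough userspace sketch of the two variants as I
understand them. It is not the actual sched.c code: struct rq_sketch and both
function names are made up for illustration, and ACCESS_ONCE() is redefined
locally with the usual volatile-cast idiom (it relies on gcc's __typeof__).

```c
/* Userspace stand-in for the kernel's ACCESS_ONCE(): the volatile cast
 * makes the compiler perform the load once and not re-read x later. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical, trimmed-down run queue; NOT the real struct rq. */
struct rq_sketch {
	unsigned long nr_running;        /* another CPU may change this at any time */
	unsigned long load_weight;
	unsigned long avg_load_per_task;
};

/* Racy variant: nr_running is read for the test and again for the
 * division, so it can drop to zero between the two loads. */
static unsigned long avg_load_racy(struct rq_sketch *rq)
{
	if (rq->nr_running)
		rq->avg_load_per_task = rq->load_weight / rq->nr_running;
	return rq->avg_load_per_task;
}

/* Snapshot variant: load once, then test and divide by the same value. */
static unsigned long avg_load_snapshot(struct rq_sketch *rq)
{
	unsigned long nr_running = ACCESS_ONCE(rq->nr_running);

	if (nr_running)
		rq->avg_load_per_task = rq->load_weight / nr_running;
	else
		rq->avg_load_per_task = 0;
	return rq->avg_load_per_task;
}
```

Without the volatile cast, the local variable alone gives no guarantee,
since the compiler may legally turn each use of nr_running back into a
load of rq->nr_running.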

I'll check Linus's current repo next time before suggesting bug fixes.

-piet