Re: [PATCH -tip] fix race between stop_two_cpus and stop_cpus

From: Mel Gorman
Date: Fri Nov 01 2013 - 09:44:34 EST


On Fri, Nov 01, 2013 at 07:36:36AM -0400, Rik van Riel wrote:
> On 11/01/2013 07:08 AM, Mel Gorman wrote:
> > On Thu, Oct 31, 2013 at 04:31:44PM -0400, Rik van Riel wrote:
> >> There is a race between stop_two_cpus, and the global stop_cpus.
> >>
> >
> > What was the trigger for this? I want to see what was missing from my own
> > testing. I'm going to go out on a limb and guess that CPU hotplug was also
> > running in the background to specifically stress this sort of rare condition.
> > Something like running a standard test with the monitors/watch-cpuoffline.sh
> > from mmtests running in parallel.
>
> AFAIK the trigger was a test that continuously loads and
> unloads kernel modules, while doing other stuff.
>

OK, thanks.

> >> + wait_for_global:
> >> + /* If a global stop_cpus is queuing up stoppers, wait. */
> >> + while (unlikely(stop_cpus_queueing))
> >> + cpu_relax();
> >> +
> >
> > This partially serialises callers to migrate_swap() while it is checked
> > if the pair of CPUs are being affected at the moment. It's two-stage
>
> Not really. This only serializes migrate_swap if there is a global
> stop_cpus underway.
>

OK, I see your point now, but I still wonder whether this is too
specialised for what we are trying to do. Could it have been done with a
read-write semaphore, with the global stop_cpus taking it for write and
stop_two_cpus taking it for read?

--
Mel Gorman
SUSE Labs