Re: Shared memory not SMP safe to user-mode code.

Richard B. Johnson (root@chaos.analogic.com)
Tue, 30 Nov 1999 11:59:25 -0500 (EST)


On Tue, 30 Nov 1999, Alan Cox wrote:

> > The following program shows that shared memory is not SMP safe.
> > It is my understanding that, from the user's code point-of-view,
> > the fact that a machine may have two (or more) processors should
> > be invisible.
>
> Not entirely.
>
> > It appears as though, on a SMP machine, shared memory is not the same
> > page shared amongst tasks. It appears (emphasis upon appears) as though
> > shared-memory is a copy of something that does not get updated properly
> > after a write (like it's in one CPU's cache, but not in another). I
> > think one has to invalidate the cache after a write to shared memory??
>
> Your code is broken.
>
> > #define cli __asm__ __volatile__("cli\n")
> > #define sti __asm__ __volatile__("sti\n")
>
> These only affect the local CPU
>
> > while((key = pars->key++) == 0)
>
> This isnt SMP safe
>
> > pars->spin += key;
>
> This isnt SMP safe etc..
>
>

Well this is all wonderful, but there can only be one CPU per task at
any one instant in time. If there are two CPUs mucking with the same
task, something other than the demo code is broken.

If you look at the code, it doesn't give a damn what the result of
"pars->spin += key" is, as long as it can be undone with "pars->spin -=
key". Further, 'key' is local to a task. It doesn't care what the
value is as long as it's non-zero.

Further, the cli and sti macros are not even necessary. They were
just added (along with iopl) in an attempt to verify the problem

Neither neither "pars->spin" nor "pars->key" can be partially updated
before an interrupt occurs. They can be interrupted before or after
the summation, but not in the middle. The code purposely increments
and decrements the shared-memory variable, and will do it continually
until a point is found where there is no CPU contention. That is
the purpose of it. It is expected that there will be lots of loops
through the code where both the child and the parent are mucking with
the same variable. When this is discovered, the contenders subtract
their keys and sleep for a 'random' period of time so resolve the
deadlock.

The problem is not that deadlocks occasionally occur. The problem
is that, on a SMP machine, what was written by one CPU does not
appear in shared memory when viewed by another CPU. This is a bug.

Cheers,
Dick Johnson

Penguin : Linux version 2.3.13 on an i686 machine (400.59 BogoMips).
Warning : It's hard to remain at the trailing edge of technology.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/