Re: [PATCH 0/24] make atomic_read() behave consistently across allarchitectures

From: Chris Snook
Date: Tue Aug 21 2007 - 10:01:42 EST


David Miller wrote:
From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Date: Mon, 20 Aug 2007 22:46:47 -0700 (PDT)

Ie a "barrier()" is likely _cheaper_ than the code generation downside from using "volatile".

Assuming GCC were ever better about the code generation badness
with volatile that has been discussed here, I much prefer
we tell GCC "this memory piece changed" rather than "every
piece of memory has changed" which is what the barrier() does.

I happened to have been scanning a lot of assembler lately to
track down a gcc-4.2 miscompilation on sparc64, and the barriers
do hurt quite a bit in some places. Instead of keeping unrelated
variables around cached in local registers, it reloads everything.

Moore's law is definitely working against us here. Register counts, pipeline depths, core counts, and clock multipliers are all increasing in the long run. At some point in the future, barrier() will be universally regarded as a hammer too big for most purposes. Whether or not removing it now constitutes premature optimization is arguable, but I think we should allow such optimization to happen (or not happen) in architecture-dependent code, and provide a consistent API that doesn't require the use of such things in arch-independent code where it might turn into a totally superfluous performance killer depending on what hardware it gets compiled for.

-- Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/