Re: [PATCH 0/24] make atomic_read() behave consistently across allarchitectures

From: Linus Torvalds
Date: Sat Aug 18 2007 - 18:49:25 EST




On Sat, 18 Aug 2007, Paul E. McKenney wrote:
>
> One of the gcc guys claimed that he thought that the two-instruction
> sequence would be faster on some x86 machines. I pointed out that
> there might be a concern about code size. I chose not to point out
> that people might also care about the other x86 machines. ;-)

Some (very few) x86 uarchs do tend to prefer "load-store" like code
generation, and doing a "mov [mem],reg + op reg" instead of "op [mem]" can
actually be faster on some of them. Not any that are relevant today,
though.

Also, that has nothing to do with volatile, and should be controlled by
optimization flags (like -mtune). In fact, I thought there was a separate
flag to do that (ie something like "-mload-store"), but I can't find it,
so maybe that's just my fevered brain..

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/