Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

From: Paul E. McKenney
Date: Sat Aug 18 2007 - 19:20:00 EST


On Sat, Aug 18, 2007 at 03:41:13PM -0700, Linus Torvalds wrote:
>
>
> On Sat, 18 Aug 2007, Paul E. McKenney wrote:
> >
> > One of the gcc guys claimed that he thought that the two-instruction
> > sequence would be faster on some x86 machines. I pointed out that
> > there might be a concern about code size. I chose not to point out
> > that people might also care about the other x86 machines. ;-)
>
> Some (very few) x86 uarchs do tend to prefer "load-store" like code
> generation, and doing a "mov [mem],reg + op reg" instead of "op [mem]" can
> actually be faster on some of them. Not any that are relevant today,
> though.

;-)

> Also, that has nothing to do with volatile, and should be controlled by
> optimization flags (like -mtune). In fact, I thought there was a separate
> flag to do that (i.e. something like "-mload-store"), but I can't find it,
> so maybe that's just my fevered brain..

Good point, will suggest this if the need arises.
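
For concreteness, here is a minimal sketch of the two code shapes being
discussed. The type and function names (my_atomic_t, my_read_plain,
my_read_volatile, add_plain, add_volatile) are made up for illustration
and are not the kernel's definitions. The instruction sequences in the
comments are what a compiler might plausibly emit on x86; the actual
output depends on the compiler version and on tuning flags such as -mtune.

```c
/* Hypothetical counter type, loosely modelled on an atomic_t-style wrapper. */
typedef struct { int counter; } my_atomic_t;

/* Plain read: the compiler is free to fold the load into its consumer. */
static inline int my_read_plain(const my_atomic_t *v)
{
	return v->counter;
}

/*
 * Volatile read: the access must be performed exactly once, but the
 * qualifier says nothing about which instruction form is "faster".
 */
static inline int my_read_volatile(const my_atomic_t *v)
{
	return *(const volatile int *)&v->counter;
}

/*
 * Either function below could in principle be compiled as
 *
 *     add  eax, [mem]          ; "op [mem]", one instruction
 *
 * or as the load-store style
 *
 *     mov  edx, [mem]          ; load into a scratch register
 *     add  eax, edx            ; then operate on registers
 *
 * Both forms read memory once and produce the same result.
 */
int add_plain(const my_atomic_t *v, int x)
{
	return x + my_read_plain(v);
}

int add_volatile(const my_atomic_t *v, int x)
{
	return x + my_read_volatile(v);
}
```

Comparing the output of "gcc -O2 -S" for the two functions, with and
without a -mtune setting, is one way to see which form a given compiler
picks.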

Thanx, Paul