Re: [PATCH 09/10] x86-32: use SSE for atomic64_read/set if available

From: Luca Barbieri
Date: Thu Feb 18 2010 - 05:27:16 EST


> CR changes are slow and synchronize the CPU. The later is always slow.
>
> It sounds like you didn't time it?
I didn't, because I think it strongly depends on the microarchitecture
and I don't have a comprehensive set of machines to test on, so it
would just be a single data point.

The lock prefix on cmpxchg8b is also serializing, so it might be just as bad.
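
For reference, here is a rough user-space sketch of the cmpxchg8b-style
alternative being compared against. It is not the code from the patch: the
names cas_read64, cas_set64 and cas_demo are made up for illustration, and it
assumes GCC's __sync_val_compare_and_swap builtin, which on x86-32 with
cmpxchg8b available is normally lowered to a locked cmpxchg8b. The read is a
compare-and-swap whose expected and new values are both zero, so it either
rewrites a zero with zero or stores nothing, and returns the old contents in
both cases.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: atomic 64-bit read built from a compare-and-swap.
 * Expected value and new value are both 0, so memory is never changed. */
static uint64_t cas_read64(uint64_t *p)
{
    return __sync_val_compare_and_swap(p, (uint64_t)0, (uint64_t)0);
}

/* Hypothetical helper: atomic 64-bit store as a compare-and-swap loop. */
static void cas_set64(uint64_t *p, uint64_t v)
{
    /* First guess at the old value; it may be torn, the loop corrects it. */
    uint64_t old = *(volatile uint64_t *)p;

    for (;;) {
        uint64_t seen = __sync_val_compare_and_swap(p, old, v);
        if (seen == old)
            break;      /* the swap happened */
        old = seen;     /* retry with the value actually observed */
    }
}

/* Small self-check showing the intended use. */
static void cas_demo(void)
{
    uint64_t v = 0x0123456789abcdefULL;

    assert(cas_read64(&v) == 0x0123456789abcdefULL);
    cas_set64(&v, 42);
    assert(cas_read64(&v) == 42);
}
```

Every read and every store here pays for a locked instruction, which is the
cost the SSE path is trying to avoid.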

Anyway, if we use this, we should keep TS cleared in kernel mode and
lazily restore it on return to userspace.
This would make clts/stts performance mostly moot.
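
To make the idea concrete, here is a toy model of the bookkeeping, not kernel
code. The struct and function names (cpu_model, kernel_sse_op,
return_to_user) are hypothetical, the TS bit is just a bool, and it assumes
one possible reading of the scheme: TS is cleared the first time the kernel
needs SSE and is put back only on the way out to userspace.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU; nothing here touches real control registers. */
struct cpu_model {
    bool ts;              /* models CR0.TS */
    bool user_wants_ts;   /* TS value userspace must see on return */
    unsigned cr0_writes;  /* counts modelled clts/stts executions */
};

/* An SSE-based atomic64 operation performed in kernel mode. */
static void kernel_sse_op(struct cpu_model *c)
{
    if (c->ts) {          /* models clts, paid only on first use */
        c->ts = false;
        c->cr0_writes++;
    }
    /* ... the 64-bit SSE load/store would go here ... */
}

/* Lazy restore on the return-to-userspace path. */
static void return_to_user(struct cpu_model *c)
{
    if (c->ts != c->user_wants_ts) {   /* models stts, only if needed */
        c->ts = c->user_wants_ts;
        c->cr0_writes++;
    }
}

/* Self-check: many operations in one kernel entry cost two CR0 writes. */
static void lazy_ts_demo(void)
{
    struct cpu_model c = { .ts = true, .user_wants_ts = true, .cr0_writes = 0 };

    for (int i = 0; i < 1000; i++)
        kernel_sse_op(&c);
    return_to_user(&c);

    assert(c.cr0_writes == 2);
    assert(c.ts);
}
```

In this model the number of CR0 writes per kernel entry is bounded by two,
regardless of how many atomic64 operations run in between.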

I agree that this feature would need to be added too before putting the
SSE atomic64 code in a released kernel.

> It'll generate worse code because gcc can't use these registers
> at all in the C code. Some gcc versions also tend to give up when they run
> out of registers too badly.
Yes, but the C implementations are small and simple, and are only used
on 386/486.
Furthermore, the data in the global register variables is the main
input to the computation.

> So why don't you simply use normal asm inputs/outputs?
I do, on the caller side.

In the callee, I don't see any other robust way to implement parameter
passing in ebx/esi other than global register variables (without
resorting to pure assembly, which would prevent reusing the generic
atomic64 implementation).