Fast 64-bit atomic writes (SSE?)

From: Roland Dreier
Date: Fri Mar 19 2004 - 19:41:55 EST


Hi, I'm trying to find the best (fastest) way to write a 64-bit
quantity atomically to a device (on i386). Taking a spinlock around
two 32-bit accesses seems to be rather expensive. I'm mostly
concerned about newish CPUs, so I'm willing to use SSE or SSE2
instructions (of course falling back to a slower locked path if the
kernel is built for an old CPU).
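
For reference, the locked fallback I have in mind is roughly the
following (an untested sketch; the lock and function name are made up
for illustration). Note it is only atomic with respect to other
writers that take the same lock:

static spinlock_t mywrite64_lock = SPIN_LOCK_UNLOCKED;

static inline void mywrite64_locked(u32 *val, void *dest)
{
        unsigned long flags;

        /* Two 32-bit MMIO writes under a lock; the device sees the
           low word land before the high word. */
        spin_lock_irqsave(&mywrite64_lock, flags);
        writel(val[0], dest);
        writel(val[1], dest + 4);
        spin_unlock_irqrestore(&mywrite64_lock, flags);
}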

I guess I have two questions, a general one and a specific one.

General question:
What's the best way to do this?

Specific question:
I've come up with the function below. However, you may notice that
I'm forced to use movdqu (instead of movdqa, which is presumably
faster). This is because even with __attribute__((aligned(16))),
my xmmsave array is not guaranteed to be aligned to 16 bytes. I could
just allocate 31 bytes for xmmsave and align the pointer to 16 bytes
by hand, but that seems a little ugly. Is there some magic I'm missing?
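
(Concretely, the by-hand alignment I mean is something like:

        u8 buf[16 + 15];
        u8 *xmmsave = (u8 *) (((unsigned long) buf + 15) & ~15UL);

which does guarantee a 16-byte-aligned pointer, but it's not pretty.)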

Thanks,
Roland

static inline void mywrite64(u32 *val, void *dest)
{
        u8 xmmsave[16] __attribute__((aligned(16)));

        /* The #ifdefs are a hack to deal with 2.4 kernels without preempt. */
#ifdef CONFIG_PREEMPT
        preempt_disable();
#endif

        /* We use movdqu for the moment, because even
           __attribute__((aligned(16))) doesn't seem to guarantee that
           xmmsave is aligned to a 16-byte boundary. */
        __asm__ __volatile__ (
                "movdqu %%xmm0,(%0) \n\t"       /* save xmm0 */
                "movq   (%1),%%xmm0 \n\t"       /* load the 64-bit value */
                "movq   %%xmm0,(%2) \n\t"       /* one 64-bit store to the device */
                "movdqu (%0),%%xmm0 \n\t"       /* restore xmm0 */
                :
                : "r" (xmmsave), "r" (val), "r" (dest)
                : "memory");

#ifdef CONFIG_PREEMPT
        preempt_enable();
#endif
}
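
For context, a caller would use it something like this (hypothetical
values and I/O address; on little-endian i386, val[0] is the low word):

        u32 db[2] = { 0x12345678, 0x9abcdef0 }; /* low word, high word */
        mywrite64(db, ioaddr);                  /* ioaddr from ioremap() */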