Re: [PATCH] [RFC] [2.5 i386] GCC 3.1 -march support, PPRO_FENCE reduction, prefetch fixes and other CPU-related changes

From: Luca Barbieri (ldb@ldb.ods.org)
Date: Mon Aug 05 2002 - 03:12:06 EST


> I'm trying to understand why you think they are needed at all. Except
> for code that specifically does non-temporal we don't need fences on an
> X86, and the code that uses non temporal stores has its own fences built
> in.
>
> So as far as I can see the only cases we ever have to care about are
>
> PPro - processor bug
> IDT Winchip - because we run it in oostore module not strict x86 mode
>
> I don't see why you are generating extra fence instructions for other
> cases
>

Note: __volatile__ and the : : :"memory" clobber are omitted from the
asm statements below for brevity.

Both without and with patch:
- barrier(): asm("")

Without patch:
- mb(): asm("lock; addl $0,0(%%esp)")
- rmb(): asm("lock; addl $0,0(%%esp)")
- wmb(): if(OOSTORE) asm("lock; addl $0,0(%%esp)") else barrier()

With patch:
- mb(): if(SSE2) asm("mfence") else asm("lock; addl $0,0(%%esp)")
- rmb(): if(SSE2) asm("lfence") else asm("lock; addl $0,0(%%esp)")
- wmb(): if(OOSTORE) { if(MMXEXT) asm("sfence")
         else asm("lock; addl $0,0(%%esp)") } else barrier()

So I'm not adding any fences: I'm only replacing the existing
lock; addl $0,0(%%esp) with the matching Xfence instruction (mfence,
lfence or sfence), which is more efficient, on CPUs that support it.

As for whether fences are needed at all: based on the Intel
documentation, it seems that we need read fences when reading any
hardware location that is not mapped as uncacheable, and write fences
when writing to any memory location mapped as write combining.

Since drivers often map cacheable memory and then use rmb(), rmb()
cannot be made a nop.
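
The kind of driver code I have in mind looks roughly like the sketch
below. Everything in it is made up for illustration: struct rx_desc
and rx_desc_poll() are hypothetical, and rmb() is defined here as a
C11 acquire fence only so that the example builds outside the kernel;
it is a stand-in, not the kernel's definition.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Portable stand-in for the kernel's rmb(); an assumption of this sketch. */
#define rmb() atomic_thread_fence(memory_order_acquire)

/* Hypothetical descriptor in cacheable memory that a device fills in. */
struct rx_desc {
    volatile uint32_t done;   /* set by the device after it has written len */
    volatile uint32_t len;    /* only meaningful once done is non-zero */
};

/* Return the length if the descriptor is complete, or -1 if it is not. */
long rx_desc_poll(const struct rx_desc *d)
{
    if (!d->done)
        return -1;
    rmb();                    /* order the read of done before the read of len */
    return (long)d->len;
}
```

If rmb() expanded to nothing here, nothing would order the two reads
beyond what the CPU already guarantees for that memory type.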





