Re: New Intel Bug! Only affects memory read performance

Jeffrey B. Siegal (jbs@quiotix.com)
Thu, 25 Dec 1997 00:17:28 -0800


Their results are indeed dramatic (71% increase in main memory read
bandwidth, 51% increase in L2 cache read bandwidth). However, the issue may
be whether the data could already be in L1 cache. If so, the convoluted code
required to work around the "bug" will almost certainly decrease performance.
This is similar to the Pentium memcpy patch, which is only advantageous if
the destination is not in L1. It is actually a lot slower when the
destination is in L1, which may account for the mixed results that people
have reported. Supposedly (but I haven't investigated this myself) the MMX
instructions are a lot faster than the floating point instructions; using
them might produce a Pentium (MMX) memcpy patch with less of a penalty when
the destination is in L1.

Maybe the thing to do is to make available special versions of the memory
routines (memcpy and friends) which should be used when the appropriate cache
behavior is desired. All the routines can in fact be the same code when
running on an architecture without specially tuned versions.
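
To make that concrete, here is a rough sketch of what such an interface might look like. The names memcpy_uncached, memcpy_cached and HAVE_TUNED_MEMCPY_UNCACHED are made up for illustration; they are not an existing API, and no tuned implementation is shown here. On an architecture without a tuned version, both entry points collapse to plain memcpy, so callers can state the cache behavior they expect at no cost.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical cache-aware copy entry points (names are assumptions,
 * not part of any real library).
 *
 * memcpy_uncached: caller expects the destination NOT to be in L1.
 * memcpy_cached:   caller expects the destination to be in L1.
 */
#ifdef HAVE_TUNED_MEMCPY_UNCACHED
/* A tuned version would be supplied elsewhere for this architecture. */
void *memcpy_uncached(void *dst, const void *src, size_t n);
#else
/* No tuned version available: fall back to the ordinary routine. */
static inline void *memcpy_uncached(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}
#endif

/* The cached case is assumed to be served well by plain memcpy. */
static inline void *memcpy_cached(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}
```

A caller copying into a large, freshly allocated buffer would use memcpy_uncached(dst, src, n), while a caller updating a small, recently touched structure would use memcpy_cached(dst, src, n). Either call behaves exactly like memcpy as far as the result is concerned; only the performance tradeoff would differ, and only where a tuned version exists.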