Re: Saving syscall cycles on a Cyrix

geerten kuiper
Sun, 15 Jun 1997 13:29:49 +0200

At 12:23 12/06/97 +0100, Mike Jagdis wrote:
>On Thu, 12 Jun 1997, Thomas Koenig wrote:
>> prompted by a discussion on comp.sys.development.system, I had a look
>> at entry.S, and found that this can be speeded up significantly for
>> a Cyrix.
>> The problem is that two pushl/popl instructions can't be paired in the
>> Cyrix' X and Y pipelines because the esp register gets modified
>> in both.
>> [...]
>> only takes four, because any two instructions can pair. The same
>> goes for the reverse, i.e. the popl instructions. Cyrix patch
>> maintainers, are you listening? :-)
>Most modern x86 processors have dual pipelines and can benefit
>from similar optimizations. The current gcc isn't particularly
>smart about interleaving code paths to avoid "bubbling" in the
>pipelines. I've been reading up on such tuning recently and, if
>anyone is interested, have hand tuned the rc5 cracking code to
>go from ~145K keys/s (on my machine) to ~205K keys/s (yes, you
>can get a big difference!). Of course, this is at the assembler
>level and the result is, ah, "not easily readable" but even
>paying attention to the order you do things in C can show
>reasonable gains. Whether such things are good for MIPS, Sparc,
>Alpha, PPC etc. as well is another question :-).
> Mike
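
As a rough illustration of the ordering point Mike makes for C code (a sketch only, not his rc5 code; the function names sum_serial and sum_interleaved are made up for this example): splitting one long dependency chain into two independent ones gives a dual-pipeline CPU a chance to issue two instructions per cycle. Whether it actually helps depends on the compiler and the processor, so it would need to be measured.

```c
#include <stddef.h>
#include <stdint.h>

/* One dependency chain: every addition has to wait for the previous one. */
uint32_t sum_serial(const uint32_t *v, size_t n)
{
    uint32_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Two independent chains: s0 and s1 never depend on each other, so a
 * dual-pipeline CPU may be able to pair the two additions. */
uint32_t sum_interleaved(const uint32_t *v, size_t n)
{
    uint32_t s0 = 0, s1 = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += v[i];
        s1 += v[i + 1];
    }
    if (i < n)          /* odd element count: pick up the last element */
        s0 += v[i];
    return s0 + s1;
}
```

Both functions return the same value for any input, because unsigned addition wraps and is associative; only the order of the work differs.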

You may be interested in David Mosberger's paper on optimizations for
Alpha. Some of them work quite well for the higher end x86's as well.

>For those of you who couldn't make it to LinuxExpo '97 but are
>interested in making code run fast, my paper is now available at:
>(in addition to all the other papers available at
> I appended the title &
>abstract below.
> --david


Geerten Kuiper          | "It is customary to append a signature or
Corn. Houtmanstraat 113 | .sig to a mail message, usually containing
2593 RG Den Haag        | information on the author, along with a
Nederland               | joke or a motto."
                        |     Olaf Kirch
                        |     LINUX Network Administrators Guide