Re: 2.2.13 & gcc-2.95.1

Horst von Brand (vonbrand@inf.utfsm.cl)
Mon, 20 Sep 1999 11:01:13 -0400


Mikael Pettersson <mikpe@csd.uu.se> said:
> [Somewhat off-topic, but gcc problems are relevant for kernel folks too.]
> I Lee Hetherington wrote:
> >Mikael Pettersson wrote:
> >> This patch for 2.2.x (originally by Artur Skawina for 2.3.x)
> >> eliminates at least one problem of using gcc-2.95, namely its
> >> utterly stupid and unnecessary 16-byte stack alignment default.

> >Why is it utterly stupid and unnecessary?

> It's unnecessary because the desired effect, stricter alignment of
> certain data objects, can be achieved manually when and where it matters.

It is easier to just ensure the stack is sanely aligned all the time. This
means at most 14 bytes/call for a noticeable performance increment
(floating point mostly, IIRC). In the kernel, with its minuscule stacks,
the wasted memory sure hurts, elsewhere it doesn't matter at all.

[...]

> > To ensure proper alignment of this values on the stack, the stack
> > boundary must be as aligned as that required by any value stored
> > on the stack. Further, every function must be generated such that
> > it keeps the stack aligned.
>
> This is simply not true.
>
> 1) Proper alignment of these data types can easily be achieved manually,
> without imposing any restrictions on the "stack boundary" (i.e. SP/FP).
> It is not necessary to keep SP 16-byte aligned at each procedure call.
>
> char buffer[N*sizeof(T)+sizeof(T)-1];
> T *arr = (T*)(((unsigned long)buffer+sizeof(T)-1) & ~(sizeof(T)-1));
> /* now rejoice knowing arr[0..N-1] are all properly aligned */

You may write that, others don't want to go over legacy code and add this
kind of black magic all over the place, just for the case they might
compile with gcc on ia32.

> 2) Application programmers that _really_ need FP or SSE performance
> are usually no newbies. They know that they may have to adapt/tweak
> their "codes" and "kernels" (FORTRAN-speak) to the hardware at hand.
> At the very least, they know how to add "-mpreferred-stack-boundary=4"
> to their Makefiles.

It is better for everybody involved if you _don't_ have to RTFM looking for
obscure flags that are essential for decent performance.

If you really see a decrease in performance in real-world code, contact the
egcs folks and discuss it with them.

-- 
Dr. Horst H. von Brand                       mailto:vonbrand@inf.utfsm.cl
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/