Re: _fpstate_fxsave & al

From: Gareth Hughes (gareth@precisioninsight.com)
Date: Wed Jun 07 2000 - 08:32:01 EST


Ulrich Drepper wrote:
>
> That's not true. fxsave is mainly there to decrease context
> switching time (and to support the SSE registers). Delivering signals
> is a very slow process and a few more cycles spent on setting up data
> structures do not really matter. I already said I wouldn't complain too
> much if the old fp state is left alone and initialized with all zeroes
> or so. But the stack layout should look something like this:
>
> stack end --> fxsave context end
> ...
> fxsave context begin
> fsave context end
> ...
> fsave context begin
>
> sigcontext end
> ...
> sigcontext begin
> stack low -->
>
> I.e., _libc_fpstate should be defined as
>
> struct _libc_fpstate
> {
> struct _libc_fpstate_fsave fsave;
> struct _libc_fpstate_fxsave fxsave;
> };
>
> The ucontext_t struct could stay as it is (after replacing
> _libc_fpstate with _libc_fpstate_fsave) even though the memory for
> the __fpregs_mem part is allocated as a _libc_fpstate object (i.e.,
> will also include the fxsave stuff).
>
> magic should go somewhere else (csseg or wherever). Note that csseg
> would automatically be zero if you clear the fsave struct in case
> there is a fxsave struct.

Okay, okay :-) The more I think about it the more I like this idea.
How about this:

1) FXSAVE into the fxsave part of the above structure. This keeps FPU
exception flags intact.

2) FSAVE into the fsave part of the above structure. This will clear
the FPU exceptions, so must be done second. I'd prefer to just call
this rather than extract the info from the FXSAVE format.

3) Extract the status word from the FSAVE format, and set the fsave
part's status field.

4) Use the high word of the fsave's status field as the magic. I think
this will be the best place for it. This will be 0xffff (I'm fairly
sure FSAVE sets the high part of the status word to this) for the
regular FSAVE format and something else if the FXSAVE format is
included.

So, this will be compatible with applications that don't know about the
FXSAVE format info, and it can be safely determined if the FXSAVE format
info is available.
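
To make the four steps concrete, here is a rough sketch in C. It is only
an illustration under assumptions: the struct and function names are
hypothetical, the two "fake_" helpers stand in for the real FXSAVE and
FSAVE instructions so the code runs anywhere, the status/magic pair is
modelled as two 16-bit halves following the register area, and 0x0000
is an assumed value for the "something else" magic. Only 0xffff for the
plain FSAVE case comes from the proposal above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FP_MAGIC_FSAVE_ONLY 0xffffu /* regular FSAVE data only */
#define FP_MAGIC_FXSAVE     0x0000u /* assumed value: FXSAVE image follows */

struct fp_fsave_area {              /* hypothetical layout */
    uint32_t cw, sw, tag, ipoff, cssel, dataoff, datasel;
    uint8_t  st[80];                /* 8 x 10-byte x87 registers */
    uint16_t status;                /* copy of the status word (step 3) */
    uint16_t magic;                 /* high half of the status dword (step 4) */
};

struct fp_fxsave_area {
    uint8_t image[512];             /* FXSAVE writes a 512-byte image */
};

struct fp_state {
    struct fp_fsave_area  fsave;    /* old-format consumers read only this */
    struct fp_fxsave_area fxsave;   /* real FXSAVE needs 16-byte alignment */
};

/* Stand-in for FXSAVE: the status word sits at byte offset 2 of the image. */
static void fake_fxsave(struct fp_fxsave_area *dst, uint16_t hw_sw)
{
    memset(dst, 0, sizeof *dst);
    memcpy(&dst->image[2], &hw_sw, sizeof hw_sw);
}

/* Stand-in for FSAVE: records the status word in the environment part. */
static void fake_fsave(struct fp_fsave_area *dst, uint16_t hw_sw)
{
    memset(dst, 0, sizeof *dst);
    dst->sw = hw_sw;
}

static void save_fp_state(struct fp_state *out, uint16_t hw_sw, int have_fxsr)
{
    assert(out != NULL);
    if (have_fxsr)
        fake_fxsave(&out->fxsave, hw_sw);        /* step 1: FXSAVE first */
    fake_fsave(&out->fsave, hw_sw);              /* step 2: FSAVE second */
    out->fsave.status = (uint16_t)out->fsave.sw; /* step 3: copy status */
    out->fsave.magic = have_fxsr ? FP_MAGIC_FXSAVE
                                 : FP_MAGIC_FSAVE_ONLY; /* step 4: magic */
}

/* Consumer side: anything other than 0xffff means the FXSAVE part is valid. */
static int fp_state_has_fxsave(const struct fp_state *s)
{
    return s->fsave.magic != FP_MAGIC_FSAVE_ONLY;
}
```

An application that only knows the FSAVE format keeps reading the fsave
member and never looks at magic; one that knows about FXSAVE calls
something like fp_state_has_fxsave() before touching the second area.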

I like this method as it's still clean and there is no cruft from
extracting the FSAVE info from the FXSAVE format. Do you agree? I
apologize for being completely unreasonable and stubborn, I just got off
a long international flight :-)

-- Gareth

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/


