Re: [PATCH 198/208] x86/fpu: Document the various fpregs state formats

From: Ingo Molnar
Date: Wed May 06 2015 - 00:20:40 EST



* Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> wrote:

> On 05/05/2015 10:58 AM, Ingo Molnar wrote:
> > +/*
> > + * This is our most modern FPU state format, as saved by the XSAVE
> > + * and restored by the XRSTOR instructions.
> > + *
> > + * It consists of a legacy fxregs portion, an xstate header and
> > + * subsequent fixed size areas as defined by the xstate header.
> > + * Not all CPUs support all the extensions.
> > + */
> > struct xregs_state {
> > struct fxregs_state i387;
> > struct xstate_header header;
> > @@ -150,6 +169,13 @@ struct xregs_state {
> > /* New processor state extensions will go here. */
> > } __attribute__ ((packed, aligned (64)));
>
> Fenghua has a "fix" for this, but I think this misses a pretty big point.
>
> This structure includes only the "legacy" state, followed by the header.
> The remainder of the layout here is enumerated in CPUID leaves and can
> not be laid out in a structure because we do not know what it looks like
> until we run CPUID.
>
> There is logically a variable length array at the end of this
> sucker.

Yes, exactly, that is where we want to go, and this direction is what
I tried to cover with this bit of the series:

	struct xregs_state {
		struct fxregs_state	i387;
		struct xstate_header	header;
		u8			__reserved[XSTATE_RESERVE];
	} __attribute__ ((packed, aligned (64)));

Note how it's now opaque after the xstate header, because there's no
guarantee of what's in that area.
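
To make the opaque part more concrete: in the standard (non-compacted)
format, the offset and size of each extended component is reported by
CPUID leaf 0xD (sub-leaf i: EAX is the size, EBX the offset). Below is a
minimal user-space sketch of the resulting size computation. The table
stands in for the CPUID results, and the struct and function names are
made up for illustration; this is not code from the series:

```c
#include <stddef.h>
#include <stdint.h>

#define LEGACY_AREA_SIZE	512u	/* legacy fxregs portion */
#define XSTATE_HEADER_SIZE	 64u	/* xstate header */

/* Hypothetical stand-in for one CPUID.(EAX=0xD, ECX=i) result. */
struct xcomp_info {
	uint32_t offset;	/* EBX: offset from the start of the buffer */
	uint32_t size;		/* EAX: size of the component in bytes */
};

/*
 * Size of a standard-format buffer that can hold every feature set in
 * 'mask'. Bits 0 and 1 (x87, SSE) live in the legacy area, so only
 * bits >= 2 are looked up in the table.
 */
static size_t xstate_size(uint64_t mask, const struct xcomp_info *tbl,
			  unsigned int nr)
{
	size_t end = LEGACY_AREA_SIZE + XSTATE_HEADER_SIZE;
	unsigned int i;

	for (i = 2; i < nr && i < 64; i++) {
		size_t comp_end = (size_t)tbl[i].offset + tbl[i].size;

		if (!(mask & (1ULL << i)))
			continue;
		if (comp_end > end)
			end = comp_end;
	}
	return end;
}
```

With the YMM component at offset 576 and size 256 (the values usually
enumerated for AVX), a mask of 0x7 yields 832 bytes and a mask of 0x3
yields 576 bytes, i.e. just the legacy area plus the header.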

The only 'fixed' aspect of the xstates is the feature bit enumeration:

	enum xfeature_bit {
		XSTATE_BIT_FP,
		XSTATE_BIT_SSE,
		XSTATE_BIT_YMM,
		XSTATE_BIT_BNDREGS,
		XSTATE_BIT_BNDCSR,
		XSTATE_BIT_OPMASK,
		XSTATE_BIT_ZMM_Hi256,
		XSTATE_BIT_Hi16_ZMM,

		XFEATURES_NR_MAX,
	};
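
As a small illustration of how such a bit enumeration is typically
consumed, here is a sketch that turns a bit index into a mask and tests
it against a 64-bit feature mask. The two helpers, xfeature_mask() and
xfeature_enabled(), are hypothetical names invented for this example;
only the enum is taken from the series:

```c
#include <stdbool.h>
#include <stdint.h>

enum xfeature_bit {
	XSTATE_BIT_FP,
	XSTATE_BIT_SSE,
	XSTATE_BIT_YMM,
	XSTATE_BIT_BNDREGS,
	XSTATE_BIT_BNDCSR,
	XSTATE_BIT_OPMASK,
	XSTATE_BIT_ZMM_Hi256,
	XSTATE_BIT_Hi16_ZMM,

	XFEATURES_NR_MAX,
};

/* Hypothetical helper: bit index -> single-bit mask. */
static inline uint64_t xfeature_mask(enum xfeature_bit bit)
{
	return 1ULL << bit;
}

/* Hypothetical helper: is 'bit' set in the feature mask 'xfeatures'? */
static inline bool xfeature_enabled(uint64_t xfeatures, enum xfeature_bit bit)
{
	return (xfeatures & xfeature_mask(bit)) != 0;
}
```

For example, a mask of 0x7 (FP, SSE and YMM) has XSTATE_BIT_YMM enabled
but not XSTATE_BIT_OPMASK.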

Plus with point #4 of the announcement I wanted to signal that I think
we should allocate the variable part dynamically:

4)

task->thread.fpu->state got embedded again, as
task->thread.fpu.state. This eliminated a lot of awkward late
dynamic memory allocation of FPU state and the problematic handling
of failures.

Note that while the allocation is static right now, this is a WIP
interim state: we can still do dynamic allocation of FPU state, by
moving the FPU state last in task_struct and then allocating
task_struct accordingly.

I.e. we can put the variable size state array at the end of
task_struct, make task_struct size per boot variable and still have
essentially a single static allocation for all fundamental task state.
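
A rough user-space sketch of that idea, using a C99 flexible array
member. Everything here (struct task_sketch, fpu_state_size,
alloc_task()) is made up for illustration and assumes a C11 libc that
provides aligned_alloc(); the real task_struct allocation path would
look different:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Determined once at "boot", e.g. from the CPUID enumeration. */
static size_t fpu_state_size;

struct task_sketch {
	int pid;	/* stand-in for the fixed part of the task state */

	/* Variable size FPU state, must stay the last member. */
	unsigned char fpu_state[] __attribute__ ((aligned (64)));
};

/* One allocation covers both the fixed part and the variable FPU state. */
static struct task_sketch *alloc_task(void)
{
	size_t size = sizeof(struct task_sketch) + fpu_state_size;
	struct task_sketch *t;

	/* aligned_alloc() wants the size to be a multiple of the alignment. */
	size = (size + 63) & ~(size_t)63;

	t = aligned_alloc(64, size);
	if (t)
		memset(t, 0, size);
	return t;
}
```

The point of the layout is that there is only one allocation to fail,
and it fails at task creation time rather than late, on first FPU use.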

But I first wanted to see people test this series - it's ambitious
enough as-is already!

Thanks,

Ingo