Re: FPU-intensive programs crashing with floating point exception on Cyrix MII

From: Chuck Ebbert
Date: Sun Aug 21 2005 - 04:51:10 EST


On Thu, 18 Aug 2005 12:37:30 +0200, Ondrej Zary wrote:

> > Could you modify this to print the full values of cwd and swd like this?
> >
> > printk("MATH ERROR: cwd = 0x%hx, swd = 0x%hx\n", cwd, swd);
> >
> > Then post the result.
> MATH ERROR: cwd = 0x37f, swd = 0x5020
> MATH ERROR: cwd = 0x37f, swd = 0x20
> MATH ERROR: cwd = 0x37f, swd = 0x20
> MATH ERROR: cwd = 0x37f, swd = 0x2020
> MATH ERROR: cwd = 0x37f, swd = 0x20
> MATH ERROR: cwd = 0x37f, swd = 0x1820
> MATH ERROR: cwd = 0x37f, swd = 0x1820
> MATH ERROR: cwd = 0x37f, swd = 0x2020
> MATH ERROR: cwd = 0x37f, swd = 0x20
> MATH ERROR: cwd = 0x37f, swd = 0x2800 <===========
> MATH ERROR: cwd = 0x37f, swd = 0x1820
> MATH ERROR: cwd = 0x37f, swd = 0x820
> MATH ERROR: cwd = 0x37f, swd = 0x2820
> MATH ERROR: cwd = 0x37f, swd = 0x2820
> MATH ERROR: cwd = 0x37f, swd = 0x1820
> MATH ERROR: cwd = 0x37f, swd = 0x820
> MATH ERROR: cwd = 0x37f, swd = 0x1a20

The error I marked has no exception flags set. The rest are all (masked)
denormal exceptions. Why your Cyrix MII would cause an FPU exception in these
cases is beyond me. Could you try the statically-linked mprime program?

I had hoped someone who knew more about FPU error handling would jump in.
The below code from arch/i386/kernel/traps.c sends a signal back to
userspace even when the status word shows a masked (or no) exception has
occurred. The 'case 0x000' strongly suggests this is deliberate but I
don't know why.


/*
* (~cwd & swd) will mask out exceptions that are not set to unmasked
* status. 0x3f is the exception bits in these regs, 0x200 is the
* C1 reg you need in case of a stack fault, 0x040 is the stack
* fault bit. We should only be taking one exception at a time,
* so if this combination doesn't produce any single exception,
* then we have a bad program that isn't syncronizing its FPU usage
* and it will suffer the consequences since we won't be able to
* fully reproduce the context of the exception
*/
cwd = get_fpu_cwd(task);
swd = get_fpu_swd(task);
switch (((~cwd) & swd & 0x3f) | (swd & 0x240)) {
case 0x000:
default:
break;
case 0x001: /* Invalid Op */
case 0x041: /* Stack Fault */
case 0x241: /* Stack Fault | Direction */
info.si_code = FPE_FLTINV;
/* Should we clear the SF or let user space do it ???? */
break;


(And it looks like there is a small bug in there. The switch should be:

switch (((~cwd) & swd & 0x3f) | (swd & 1 ? swd & 0x240 : 0)) {

because the SF and CC1 bits are only relevant when IE is set.)

__
Chuck
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/