Re: NMI error

From: Thomas HERAULT (Thomas.Herault@lri.fr)
Date: Wed Jul 25 2001 - 06:58:08 EST


On Wed, 25 Jul 2001 04:38:51 -0700 (PDT)
Kannan Soundarapandian <wskannan@yahoo.com> wrote:

> Hello,
>
> I am a grad student at Oregon State University. I have
> the same problem as you. I have a dual P3 1 GHz
> processor system with the Asus CUR-DLS motherboard
> (which uses the ServerWorks LE chipset), and 512 MB of
> registered Samsung RAM as one module. I am also using
> a Quantum Atlas 10K II SCSI HDD with an on-motherboard
> Symbios controller.
>
> I get the following error continuously, after which
> the system dies! I'm using Red Hat 7.1.
>
> Can you please help me fix the problem?
>
> Uhhuh. NMI received. Dazed and confused, but trying to
> continue
> Do you have a strange power saving mode enabled?
>
> Please help. Thank you very much
>
> Kannan
>
>
> =====
> --
> Kannan Soundarapandian,
> OSU, Corvallis.
>
>

Hi.
I have *almost* "solved" this problem.
Since linux-2.4.2-ac18, nmi_watchdog is disabled by default.
If you re-enable it by passing nmi_watchdog=1 as a kernel boot parameter,
the symptom disappears.
But this is only a workaround. It works because in
arch/i386/kernel/traps.c there is a test:
    if (nmi_watchdog) {
        nmi_watchdog_tick(regs);
        return;
    }
Thus our NMIs are caught and treated like watchdog NMIs.
But even if you disable nmi_watchdog (as it is by default),
these NMIs still occur on my (our?) CPUs (approximately 300 interrupts / second)
and produce these annoying error messages.
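
To make the effect of that test concrete, here is a small user-space model
of the decision, not the kernel code itself. It assumes (this is my reading
of 2.4-era traps.c, not something quoted above) that the handler first looks
at the two high bits of the status byte read from port 0x61 and only reaches
the nmi_watchdog test when both are clear. The names classify_nmi and
nmi_action are made up for the illustration.

```c
#include <assert.h>

/* Hypothetical outcomes of the NMI handler, for illustration only. */
enum nmi_action {
    NMI_WATCHDOG_TICK,   /* swallowed silently by the watchdog path */
    NMI_UNKNOWN,         /* "Uhhuh. NMI received..." is printed     */
    NMI_HARDWARE_ERROR   /* parity / I/O check bits were set        */
};

/*
 * Model of the dispatch: 'reason' stands for the byte read from port 0x61,
 * 'watchdog_enabled' for the value of nmi_watchdog.
 * Assumption: bits 7 and 6 (mask 0xc0) are tested before the watchdog check.
 */
enum nmi_action classify_nmi(unsigned char reason, int watchdog_enabled)
{
    if (reason & 0xc0)
        return NMI_HARDWARE_ERROR;
    if (watchdog_enabled)
        return NMI_WATCHDOG_TICK;   /* every other NMI is taken for a tick */
    return NMI_UNKNOWN;
}

/* Human-readable name for an outcome. */
const char *nmi_action_name(enum nmi_action action)
{
    switch (action) {
    case NMI_WATCHDOG_TICK:  return "watchdog tick";
    case NMI_UNKNOWN:        return "unknown NMI";
    case NMI_HARDWARE_ERROR: return "hardware error";
    }
    assert(0 && "unhandled nmi_action");
    return "?";
}
```

Under this model the same stray NMI is reported as unknown with
nmi_watchdog=0 and silently counted as a watchdog tick with nmi_watchdog=1,
which is why the boot parameter hides the messages without removing the
interrupts.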

On the other multi-processor motherboards I know of, when you set
nmi_watchdog=0, cat /proc/interrupts shows
0 NMIs on every CPU; on mine, I still get 300 NMIs / sec.
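
For anyone who wants to compare their own board, the check can be
automated roughly as follows. This is only a sketch: it assumes the NMI
line of /proc/interrupts looks like "NMI:" followed by one decimal counter
per CPU, and the function names (parse_nmi_counts, read_nmi_counts,
nmi_rate) are invented here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse one line of /proc/interrupts. Returns the number of per-CPU
 * counters stored in 'counts', or -1 if this is not the NMI line.
 * Assumed format: "NMI:" followed by whitespace-separated decimal counts.
 */
int parse_nmi_counts(const char *line, unsigned long *counts, int max_cpus)
{
    assert(line != NULL && counts != NULL);
    while (*line == ' ' || *line == '\t')
        line++;
    if (strncmp(line, "NMI:", 4) != 0)
        return -1;

    const char *p = line + 4;
    int n = 0;
    while (n < max_cpus) {
        char *end;
        unsigned long value = strtoul(p, &end, 10);
        if (end == p)           /* no more numbers on this line */
            break;
        counts[n++] = value;
        p = end;
    }
    return n;
}

/* Scan the file at 'path' (normally /proc/interrupts) for the NMI line. */
int read_nmi_counts(const char *path, unsigned long *counts, int max_cpus)
{
    char line[1024];
    int n = -1;
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        n = parse_nmi_counts(line, counts, max_cpus);
        if (n >= 0)
            break;
    }
    fclose(f);
    return n;
}

/* NMIs per second between two samples taken 'seconds' apart. */
double nmi_rate(unsigned long before, unsigned long after, double seconds)
{
    return (double)(after - before) / seconds;
}
```

Calling read_nmi_counts("/proc/interrupts", ...) twice, a few seconds
apart, and passing each CPU's two samples to nmi_rate gives the figure
discussed above: about 300 per second on my board, 0 on the others.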

For those on the lkml who are interested in this problem,
my motherboard is a 694D Pro (MS-6321) from MSI, with
a VIA VT82C694X chipset and an Apollo Pro133A north bridge,
featuring 2 Pentium IIIs at 800 MHz. Every kernel I tested
(2.2.x included) had the same problem.
The reasons reported for the interruptions are (2d then 3d)*,
i.e. that pair keeps repeating.
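
In case it helps whoever looks at this, the two values can be decoded bit
by bit. The sketch below assumes that 2d and 3d are the hexadecimal reason
bytes the kernel prints, i.e. the status byte read from I/O port 0x61, and
it uses the conventional PC/AT meaning of that port's bits. Whether this
chipset follows that layout exactly is an assumption, and the helper names
are invented.

```c
#include <assert.h>
#include <stdio.h>

/* Conventional PC/AT meaning of the bits read from I/O port 0x61. */
struct port61_bit {
    unsigned char mask;
    const char *name;
};

static const struct port61_bit PORT61_BITS[] = {
    { 0x80, "memory parity / SERR# error" },
    { 0x40, "I/O channel check (IOCHK#)" },
    { 0x20, "timer 2 output" },
    { 0x10, "refresh toggle" },
    { 0x08, "IOCHK# NMI disabled" },
    { 0x04, "parity / SERR# NMI disabled" },
    { 0x02, "speaker data enabled" },
    { 0x01, "timer 2 gate enabled" },
};

/* Bits that differ between two reason bytes. */
unsigned char changed_bits(unsigned char a, unsigned char b)
{
    return (unsigned char)(a ^ b);
}

/* Print the name of every bit that is set in 'reason'. */
void describe_reason(unsigned char reason, FILE *out)
{
    assert(out != NULL);
    fprintf(out, "reason %02x:\n", reason);
    for (size_t i = 0; i < sizeof PORT61_BITS / sizeof PORT61_BITS[0]; i++) {
        if (reason & PORT61_BITS[i].mask)
            fprintf(out, "  %02x  %s\n",
                    PORT61_BITS[i].mask, PORT61_BITS[i].name);
    }
}
```

If the assumption holds, neither value has an error bit (0x80 or 0x40) set,
and changed_bits(0x2d, 0x3d) is 0x10, so the two bytes differ only in the
refresh toggle bit. That would mean the status byte itself says nothing
about where the NMIs come from.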

Hope that helps, and that someone can help us find what
produces these interrupts.



