NMI trap and 2.1.37-7

Riccardo Facchetti (fizban@mbox.vol.it)
Tue, 13 May 1997 18:42:31 +0200 (MET DST)


On Fri, 9 May 1997, Martin Mares wrote:

[...]
> information based entirely on port 0x61 status (like the NMI routine in current
> pre-2.1.37-6). Maybe the rest should be a compile-time switch as most users don't
> need it at all. Probably the best solution would be to write a universal
> memory tester program doing these checks and much more.
>
> Anyway, the chipset code will probably have a NMI-hook for displaying ECC
> error data and PCI exceptions on some chipsets.

Hmmm... yes, I see. Probably the best way is to have a minimal error
detection mechanism compiled into the kernel and then, if someone wants
to investigate a memory error, use a memory tester in userland to check
the memory. The NMI mechanism can easily be reproduced in userland by
disabling NMIs and polling the NMI status flags at port 0x61, the same
way as the code I wrote for do_nmi.
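
To make the userland idea concrete, here is a minimal sketch of the
decoding step only. It assumes the conventional PC/AT layout of port 0x61
(bit 7 = memory parity error, bit 6 = I/O channel check); the names
decode_nmi_status and struct nmi_status are made up for illustration and
are not taken from the kernel. A real tester on Linux/x86 would presumably
obtain the byte with ioperm() and inb() from <sys/io.h>, which needs root,
so that part is left out to keep the sketch runnable anywhere.

```c
#include <stdbool.h>
#include <stdint.h>

/* Status bits read from port 0x61 (conventional PC/AT layout, assumed). */
#define NMI_MEM_PARITY 0x80  /* bit 7: memory parity error latched */
#define NMI_IO_CHCK    0x40  /* bit 6: I/O channel check latched   */

/* Hypothetical result type: which error sources are flagged. */
struct nmi_status {
    bool mem_parity;
    bool io_check;
};

/* Decode a byte that was previously read from port 0x61. */
struct nmi_status decode_nmi_status(uint8_t reason)
{
    struct nmi_status s;

    s.mem_parity = (reason & NMI_MEM_PARITY) != 0;
    s.io_check   = (reason & NMI_IO_CHCK) != 0;
    return s;
}
```

A polling memory tester could call decode_nmi_status() on each byte it
reads after touching a block of memory and report whenever mem_parity
comes back true.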

There is just one point, and this one is for the developers.
Currently io_check_error in the kernel toggles the I/O CHCK bit, which
clears the I/O CHCK flag. I can't see any reason not to toggle the memory
error bit too. These flags are not used anyway, so why toggle one and
not the other?
There is another reason we should clear both bits. If we get a memory
error and then, some time later (maybe much later, say hours), an I/O CHCK
error, both a memory error and an I/O CHCK error will be reported, because
the memory error flag was not cleared when do_nmi serviced the first
interrupt.
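
The stale-flag scenario can be played through with a small software model.
This is only a sketch under assumptions: struct fake_port61, port_read,
port_write and service_nmi are hypothetical names, nothing here touches real
hardware, and the model assumes that writing bit 3 clears the latched
I/O CHCK flag and writing bit 2 clears the latched memory parity flag, as on
the conventional PC/AT port 0x61.

```c
#include <stdint.h>

#define NMI_MEM_PARITY   0x80  /* status: memory parity error latched */
#define NMI_IO_CHCK      0x40  /* status: I/O channel check latched   */
#define CLEAR_IO_CHCK    0x08  /* control: set to clear I/O CHCK      */
#define CLEAR_MEM_PARITY 0x04  /* control: set to clear parity check  */

/* Hypothetical software model of port 0x61, not real port access. */
struct fake_port61 {
    uint8_t latched;  /* bits 7 and 6: latched error flags */
    uint8_t control;  /* bits 3..0: last value written     */
};

uint8_t port_read(const struct fake_port61 *p)
{
    return (uint8_t)(p->latched | p->control);
}

void port_write(struct fake_port61 *p, uint8_t value)
{
    p->control = value & 0x0f;
    if (value & CLEAR_IO_CHCK)
        p->latched &= (uint8_t)~NMI_IO_CHCK;
    if (value & CLEAR_MEM_PARITY)
        p->latched &= (uint8_t)~NMI_MEM_PARITY;
}

/*
 * Service one NMI: pulse the bits in clear_mask high and then low again,
 * and return the status byte that was seen on entry.
 */
uint8_t service_nmi(struct fake_port61 *p, uint8_t clear_mask)
{
    uint8_t reason = port_read(p);
    uint8_t ctl = reason & 0x0f;

    port_write(p, (uint8_t)(ctl | clear_mask));   /* clears the latch(es) */
    port_write(p, (uint8_t)(ctl & ~clear_mask));  /* re-enable the checks */
    return reason;
}
```

With clear_mask set to CLEAR_IO_CHCK only, a memory error that is latched
before the first call is still visible in the value returned by a later
call made for an I/O CHCK error. With CLEAR_IO_CHCK | CLEAR_MEM_PARITY the
second call sees the I/O CHCK flag alone.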

Ciao,
Riccardo.