Weird kernel crash (Motherboard bug?) after check_bugs

B. Craig Taverner (craig@ComOpt.com)
Mon, 26 Jan 1998 18:53:51 +0100 (CET)


Hi all,

I've just encountered the first machine where I've failed to install Linux
because the installation kernel crashes (with a hard reset) on bootup, in
the start_kernel() function of init/main.c.

I tried to follow what was happening by putting extra printk() lines in
the check_bugs() function in include/asm-i386/bugs.h, each followed by a
short for() busy-wait delay (I was not sure whether it would be safe to
use sleep() in kernel code), so that I could see how far the boot got.
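
For anyone wanting to try the same thing, the printk()-plus-delay tracing
described above can be sketched roughly as follows. This is a user-space
illustration only, not the code I actually put in bugs.h: printf() stands
in for printk(), and the names busy_delay() and trace_point() are
hypothetical helpers invented for this sketch. The loop counter is
declared volatile so that the compiler does not optimise the empty delay
loop away.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: spin for `iterations` passes of an empty loop.
 * The volatile counter forces the compiler to keep the loop.
 * Returns the number of passes actually made.
 */
static unsigned long busy_delay(unsigned long iterations)
{
    volatile unsigned long i;
    unsigned long count = 0;

    for (i = 0; i < iterations; i++)
        count++;
    return count;
}

/*
 * Hypothetical trace point: print a marker, then pause so the line
 * stays on screen for a moment before execution continues.
 * In kernel code the printf() call would be a printk() call, and
 * there would be no stdout to flush.
 * Returns the number of characters printed.
 */
static int trace_point(const char *label, unsigned long iterations)
{
    int written;

    assert(label != NULL);
    written = printf("TRACE: reached %s\n", label);
    fflush(stdout);
    busy_delay(iterations);
    return written;
}
```

A call such as trace_point("check_hlt", 50000000UL) would go before and
after each check; the iteration count is a guess and has to be tuned by
hand for the CPU in question, since an empty loop gives no fixed time.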

Essentially, at the end of the start_kernel() function, the thread passing
through this function goes into an infinite idle() loop. As soon as it
does that, the computer hard reboots. This seems very serious to me. I
decided not to try to find out what other kernel threads might exist, or
to look for any other way to follow the further processing of the kernel,
feeling rather out of my depth, and instead to ask the gurus for advice.

Has anyone seen such a crash before, and does anyone have any idea what it
could be? I tried the obvious things: removing all non-critical boards
(only network and sound), and recompiling another kernel and booting with
that -> same problem. It seems the crash occurs well before any device
probing, somewhere right at the beginning of the hardware setup.

Kernels I used were:
1 - stock debian installation 2.0.30
2 - stock debian installation 2.0.29
3 - custom built, very light on drivers 2.0.30 (built on debian 1.3.1)
4 - stock slackware installation 2.0.0

One interesting thing about the crash was that with an untouched kernel,
it appeared to happen immediately after the "checking 'hlt'
instruction..." printout, leading me to look at that check first.
However, once all my for() loop delays were added, the thread lasted
through the tests and the kernel banner printing (i.e. only one line
further on the screen, but a number of lines in the code), up to the
idle() loop (cpu_idle()). I have two hypotheses: 1 - the crash is caused
by another thread which waits for the start_kernel thread to go idle
before doing whatever causes the crash, 2 - the crash is caused by the
start_kernel thread, but only comes into effect when the thread goes idle.
I really don't have any further ideas.

Cheers, Craig

------
"Basic, n.:
A programming language. Related to certain social diseases in
that those who have it will not admit it in polite company."

======================================================================
Craig Taverner ------====== Email:craig@comopt.com
ComOpt AB ------======== Tel: +46-42-212580
Michael Löfmans Gata 6 ------========== Fax: +46-42-140475
SE-254 38 Helsingborg ------======== Cell: +46-708-212598
Sweden ------====== http://www.comopt.com
======================================================================